METHODS FOR OPTIMIZATION OF GENE THERAPY BY RECURSIVE
SEQUENCE SHUFFLING AND SELECTION

The present application is a Continuation-In-Part application ("CIP") of U.S.
patent application serial no. ("USSN") 08/721,824, filed September 27, 1996, which was
converted to Provisional application serial no. 60/037,742, under 35 U.S.C. § 111(b) and 37
C.F.R. § 1.53(b)(2); and a CIP of USSN 08/722,660, filed September 27, 1996. Each of the
aforementioned applications is explicitly incorporated herein by reference, in its entirety and
for all purposes.

FIELD OF THE INVENTION 

The present invention applies the field of molecular genetics to the 
improvement of vectors and other nucleic acids for use in gene therapy. Improvement is
achieved by recursive sequence recombination. 

BACKGROUND AND DESCRIPTION OF RELATED ART 

Gene therapy is the introduction of a nucleic acid into cells of a patient to 
express the nucleic acid for some therapeutic purpose; that is, the nucleic acid is itself used
as a drug. For example, an appropriate gene can be delivered to a patient with a recessive
inherited disease, such as cystic fibrosis, to correct the genetic defect and cure the disease
state. In other applications, delivery of genes encoding a toxin (e.g., diphtheria toxin, ricin, tk)
can be used to kill cancer cells, and other genes can be specifically tailored to kill infectious 
organisms. Other applications include incorporation of regulatory sequences near 
endogenous genes. These different applications are directed to many different target cells 
with many modes of delivery (e.g., in vitro, ex vivo, in situ, intravenous, and germline 
modification). 

The power of gene therapy has led many large pharmaceutical manufacturers 
and several smaller biotechnology companies to devote substantial financial and technical 
resources to developing gene therapy as a viable therapeutic approach to treating human 
diseases. Although simple in theory, gene therapy is not without technical difficulties.
Development of any gene therapy requires identification of a cell type as a target, means for
entry of DNA into those cells, means for expressing useful levels of gene product over an
appropriate time period, and avoidance of host immune response to the gene therapy agents.

The requirements for any particular application vary greatly and profoundly
influence the choice of vector to be developed and tested. Possible variables in different
applications include the efficacy of gene transfer, the efficacy of gene expression, the duration
of gene expression, the feasibility of repeat dosing, and the ability to target appropriate cells
and avoid inappropriate cells. Confounding factors that may arise include the inability of a
virus or delivery vehicle to enter into or integrate into the chromosomes of particular cells,
the shutdown of transcriptional promoters, the loss of input DNA, the destruction of treated
cells, and the neutralization of input virus or gene product. All of these factors depend on the
choice of viral vector or non-viral delivery system and on the ability of the host to respond to
that virus or delivery system.

Most of the components currently available for constructing gene therapy
vectors were not evolved or developed for gene therapy, and thus may have many undesirable
features and may lack efficacy in the desired gene therapy application. For example, most
eukaryotic viruses have evolved to optimize virulence and viral reproduction, and most
non-viral DNA delivery systems were designed to be used for experimental transfection in
laboratory conditions, not for administration to humans.

Solutions to the above difficulties and inefficiencies are needed before gene
therapy becomes effective for routine treatment of significant numbers of patients with
common diseases. The present invention fulfills this and other needs by providing, inter alia,
methods for improving vectors and other nucleic acids used in gene therapy by recursive
sequence recombination.



SUMMARY OF THE INVENTION

The invention provides methods of evolving nucleic acids for use in gene
therapy by recursive sequence recombination. The methods entail recombining at least first
and second forms of a nucleic acid segment differing from each other in at least two
nucleotides, to produce a library of recombinant segments. At least one recombinant segment
from the library is then screened for a property useful in gene therapy. At least one recombinant
segment identified by the screening is then recombined with a further form of the segment,
the same or different from the first and second forms, to produce a further library of
recombinant segments. The further library is then screened to identify at least one further
recombinant segment from the further library for improvement in the property useful for gene
therapy. Further cycles of recombination and screening are performed as necessary until the
further recombinant segment confers a desired level of the property useful for gene therapy.
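
The cycle of recombination and screening summarized above can be pictured with a small computational analogy. The following Python sketch is illustrative only and is not part of the disclosed methods: the sequences, the scoring function `toy_score` (a stand-in for any screenable property), the crossover probability, the library size and the number of segments kept per round are all assumptions chosen for the example.

```python
import random

# Hypothetical toy data: the "property" is similarity to an arbitrary target string.
TARGET = "ACGTACGTACGTACGTACGT"
FIRST = "ACGTACGTACAAAAAAAAAA"   # first form: matches TARGET in its first half
SECOND = "TTTTTTTTTTGTACGTACGT"  # second form: matches TARGET in its second half


def toy_score(segment):
    """Stand-in screen: number of positions at which the segment matches TARGET."""
    return sum(1 for a, b in zip(segment, TARGET) if a == b)


def recombine(parents, library_size, rng, crossover_prob=0.1):
    """Build a library of equal-length chimeras by copying each position from
    one parent and switching parents at random crossover points."""
    length = len(parents[0])
    library = []
    for _ in range(library_size):
        current = rng.choice(parents)
        child = []
        for i in range(length):
            if rng.random() < crossover_prob:  # assumed crossover rate
                current = rng.choice(parents)
            child.append(current[i])
        library.append("".join(child))
    return library


def screen(library, score, keep):
    """Return the `keep` best-scoring recombinant segments."""
    return sorted(library, key=score, reverse=True)[:keep]


def evolve(first, second, score, desired, max_cycles=20,
           library_size=200, keep=10, seed=0):
    """Repeat recombination and screening until the desired level is reached."""
    rng = random.Random(seed)
    parents = [first, second]
    best = max(parents, key=score)
    for _ in range(max_cycles):
        library = recombine(parents, library_size, rng)
        selected = screen(library + [best], score, keep)
        best = selected[0]
        if score(best) >= desired:
            break
        # Selected segments are recombined with further forms; here the
        # further forms are simply the original first and second forms.
        parents = selected + [first, second]
    return best
```

Calling `evolve(FIRST, SECOND, toy_score, desired=len(TARGET))` returns the best chimera found within the cycle limit. Because the current best segment is carried into each screen, the returned score never falls below that of the better starting form.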

In one embodiment, the invention provides for a method of modifying a
nucleic acid segment for use in gene therapy by recursive sequence recombination,
comprising the following steps: (1) recombining at least a first and a second form of the
segment differing in at least two positions, to produce a first set of recombinant segments;
(2) screening at least one recombinant segment for a property useful in gene therapy;
(3) recombining at least one recombinant segment generated by steps (1) and (2) with a
variant form of the segment, the same as or different from the first or second forms, to
produce a second set of recombinant segments; and (4) screening at least one recombinant
segment from the second recombination set for the property useful for gene therapy. In a
further embodiment of this method, steps (1) to (4) are repeated until the recursively
recombined segment confers the property useful for gene therapy. In additional embodiments
of this method the nucleic acid segment can be a viral nucleic acid segment, the viral nucleic
acid segment can comprise a viral vector, or at least one recombining step occurs in vivo or in
vitro.

In one embodiment, the desired property to be acquired is improved viral titer.
Here, the recombinant segments are screened as components of viruses by propagation of the 
viruses on cells for multiple generations and isolation of progeny viruses, the progeny viruses 
being enriched for viruses having recombinant segments conferring the property of improved 
titer. 

In a second embodiment, the desired property is improved viral infectivity. 
Recombinant segments can be screened as components of viruses by determining the 
percentage of a population of cells infected by a virus. 

In a third embodiment, the desired property is improved expression of a gene
within the nucleic acid segment. The recombinant segments can be screened by detecting
expression of the recombinant segments within cells.

In a fourth embodiment, the desired property is improved or altered drug
resistance. The recombinant segments can be screened by exposing the cells to the drug and
selecting surviving cells, the surviving cells being enriched for recombinant segments having
the property of improved or altered drug resistance.

In a fifth embodiment, the desired property is improved or altered tissue
specificity. The recombinant segments can be screened as components of viruses by
contacting the viruses with a first population of cells for which the property of infectivity by
the virus is desired and a second population of cells for which the property of infectivity by
the virus is not desired, and isolating progeny virus from the first population of cells, the
progeny viruses being enriched for recombinant segments conferring the property of
infectivity for the first subpopulation of cells.

In a sixth embodiment, the desired property is improved packaging capacity of
a viral capsid. The recombinant segments can be screened as components of viruses by
propagating the viruses on cells and isolating progeny viruses containing the recombinant
segments. The packaging capacity of the viral capsid containing the recombinant segments is
increased between successive screening steps.

In a seventh embodiment, the desired property is episomal retention. The cells
containing the recombinant segments can be screened by propagating the cells without
selection for the recombinant segments and then propagating the cells with selection for the
recombinant segments, the cells surviving selection being enriched for cells harboring
recombinant segments with the property of improved episomal retention.

In an eighth embodiment, the desired property is reduced immunogenicity of
the recombinant segments or an expression product thereof. The recombinant segments can
be screened by introducing the recombinant segments into a mammal and recovering
surviving recombinant segments after a period of time.

In a ninth embodiment, the desired property is site-specific integration. The
recombinant segments can be screened by introducing them into cells and recovering a region
of cellular DNA including the desired site of integration, the region being enriched for
recombinant segments with the property of site-specific integration.

In a tenth embodiment, the desired property is increased stability. The
recombinant segments can be screened as components of viruses by subjecting the viruses to
destabilizing conditions and recovering surviving viruses, these viruses being enriched for
recombinant segments conferring the property.

In an eleventh embodiment, the property is capacity to confer cellular
resistance to microorganism infection. Cells containing recombinant segments can be
screened for capacity to survive infection by the microorganism.

In a twelfth embodiment, the methods evolve vectors for introduction into
target cells in nonviral form. Recombinant segments can be selected by introducing the
recombinant segments into a mammal, recovering cells from the mammal into which the
segments are integrated and are expressed to produce the protein or antisense RNA, and
recovering the recombinant segments from the cells.

In a thirteenth embodiment, the invention provides methods of improving
adenoassociated viral proteins rep and cap for expression in a packaging cell line. Cells
containing recombinant segments of these genes are infected with a recombinant AAV
(rAAV) containing a marker gene flanked by terminal repeat sequences (ITRs) and a helper
virus, such as an adenovirus. The yields of progeny rAAV and helper virus produced by
different cells are determined and cells having a high relative yield of rAAV to helper virus
are selected.
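
The selection criterion in this embodiment is a ratio, which can be illustrated with a short Python sketch. The function names, the record layout and the example titers below are hypothetical and serve only to show how cells might be ranked by relative yield of rAAV to helper virus.

```python
def relative_yield(raav_titer, helper_titer):
    """Ratio of progeny rAAV to progeny helper virus for one cell population."""
    if helper_titer <= 0:
        raise ValueError("helper virus titer must be positive")
    return raav_titer / helper_titer


def select_high_yield(cells, top_n):
    """Rank cell populations by relative rAAV:helper yield and keep the top_n.

    `cells` is a list of dicts with hypothetical keys 'name', 'raav', 'helper'.
    """
    ranked = sorted(
        cells,
        key=lambda c: relative_yield(c["raav"], c["helper"]),
        reverse=True,
    )
    return ranked[:top_n]


# Invented titers, for illustration only.
EXAMPLE_CELLS = [
    {"name": "clone_A", "raav": 1e6, "helper": 1e8},
    {"name": "clone_B", "raav": 5e6, "helper": 1e7},
    {"name": "clone_C", "raav": 2e6, "helper": 2e8},
]
```

With these invented numbers, `select_high_yield(EXAMPLE_CELLS, 1)` keeps `clone_B`, whose ratio of rAAV to helper virus is the highest of the three.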

In a fourteenth embodiment, the nucleic acid segment comprises a coding
sequence encoding a protein or antisense RNA, which can be expressed after integration of
the segment into genomic DNA of mammalian cells.

In a fifteenth embodiment, the nucleic acid segment encodes a viral protein
and the property is capacity of a cell line containing the nucleic acid segment to package viral
DNA transfected into the cell line.

In a sixteenth embodiment, the nucleic acid segment encodes a DNA binding
protein, and the property that is enhanced is uptake by a recipient cell of a vector encoding the
DNA binding protein.

In a seventeenth embodiment, the invention provides an isolated recombinant
O6-methylguanine-DNA methyltransferase (MGMT) enzyme, as illustrated in Figure 5, with
the amino acid sequence of SEQ ID NO:2, encoded by the nucleic acid sequence of SEQ ID NO:1.
The enzyme can have at least one amino acid segment present in a natural human MGMT
coding sequence and absent in a natural nonhuman MGMT coding sequence, and has at least
one amino acid segment present in the natural nonhuman MGMT coding sequence and absent
in the natural human MGMT coding sequence.

In another embodiment, the invention provides for a phagemid-adenovirus
capable of generating single stranded DNA greater than 10 kilobases comprising an
adenovirus and a phage f1 replication origin.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1: Scheme for in vitro shuffling, "recursive sequence recombination,"
of genes.

Figure 2: Scheme for selecting DNA binding proteins conferring enhanced
DNA uptake by recipient cells.

Figure 3: Oligonucleotides used to generate recombinant forms of MGMT
using the recursive recombination methods of the invention.

Figure 4: Illustrates the natural diversity of five
alkyltransferases: human, rat, mouse, hamster, and rabbit. This diversity was used to
generate sequence diversity in the improved human MGMT gene.

Figure 5: Illustrates the nucleotide sequence (SEQ ID NO:1) and the amino
acid sequence (SEQ ID NO:2) of the improved human MGMT gene generated by the
methods of the invention.

Figure 6: Illustrates the construction of a novel adenovirus-phagemid.

DEFINITIONS

The term "screening" describes what is, in general, a two-step process in
which one first determines which cells do and do not express a screening marker and then
physically separates the cells having the desired property. Selection is a form of screening in
which identification and physical separation are achieved simultaneously by expression of a
selection marker, which, in some genetic circumstances, allows cells expressing the marker to
survive while other cells die (or vice versa). Screening markers include luciferase,
beta-galactosidase, and green fluorescent protein. Selection markers include drug and toxin
resistance genes. Although spontaneous selection can and does occur in the course of natural
evolution, in the present methods selection is performed by man.

The term "exogenous DNA segment" refers to a DNA segment which is
foreign or heterologous to the cell, or homologous to the cell but in a position within the host
cell nucleic acid in which the element is not ordinarily found. Exogenous DNA segments are
expressed to yield exogenous polypeptides.

The term "gene" is used broadly to refer to any segment of DNA associated
with a biological function. Thus, genes include coding sequences and/or the regulatory
sequences required for their expression. Genes also include nonexpressed DNA segments
that, for example, form recognition sequences for other proteins.

The terms "percentage sequence identity," "sequence identity," "sequence
similarity" or "structural similarity" are calculated or determined by comparing two optimally
aligned sequences over the window of comparison, determining the number of positions at
which the identical nucleic acid base occurs in both sequences to yield the number of matched
positions, and dividing the number of matched positions by the total number of positions in the
window of comparison. Optimal alignment of sequences for aligning a comparison window
can be conducted by computerized implementations of algorithms GAP, BESTFIT, FASTA,
and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer
Group, 575 Science Dr., Madison, WI.
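
As an illustration of the calculation just defined, the following Python sketch counts matched positions over a comparison window and divides by the window length. It assumes the two sequences have already been optimally aligned by some other tool and padded to equal length with "-" gap characters. The final multiplication by 100, which expresses the fraction as a percentage, is an assumption of this sketch, as is the function name `percent_identity`.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of window positions at which both aligned sequences
    carry the identical base. Gap characters ('-') never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must span the same window")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matched = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # Matched positions divided by total positions in the window, as a percentage.
    return 100.0 * matched / len(aligned_a)
```

For example, `percent_identity("ACGTACGT", "ACGTTCGT")` compares eight positions, seven of which match, and so returns 87.5.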

The term "naturally-occurring" is used to describe an object that can be found 
in nature as distinct from being artificially produced by man. For example, a polypeptide or
polynucleotide sequence that is present in an organism (including viruses) that can be isolated 
from a source in nature and which has not been intentionally modified by man in the 
laboratory is naturally-occurring. Generally, the term naturally-occurring refers to an object 
as present in a non-pathological (undiseased) individual, such as is typical for the species. 

The terms "isolated," "purified," or "biologically pure" refer to material which
is substantially or essentially free from components which normally accompany it as found in
its native state.

A nucleic acid is operably linked when it is placed into a functional 
relationship with another nucleic acid sequence. For instance, a promoter or enhancer is 
operably linked to a coding sequence if it increases the transcription of the coding sequence. 
Operably linked means that the DNA sequences being linked are typically contiguous and, 
where necessary to join two protein coding regions, contiguous and in reading frame. 
However, since enhancers generally function when separated from the promoter by several 
kilobases and intronic sequences may be of variable lengths, some polynucleotide elements
may be operably linked but not contiguous. 

A specific binding affinity between two molecules, for example, a ligand and a
receptor, means a preferential binding of one molecule for another in a mixture of molecules.
The binding of the molecules can be considered specific if the binding affinity is about 1 x
10' M-1 to about 1 x 10' M-1 or greater.

DETAILED DESCRIPTION

I. General

The invention provides methods of evolving, i.e., modifying, a nucleic acid for
the acquisition of or an improvement in a property or characteristic useful in gene therapy.
The substrates for this modification, or evolution, vary in different applications, as does the
property sought to be acquired or improved. Examples of candidate substrates for acquisition
of a property or improvement in a property include viral and nonviral vectors used in
gene therapy. The methods require at least two variant forms of a starting substrate. The
variant forms of candidate substrates can show substantial sequence or secondary structural
similarity with each other, but they should also differ in at least two positions. The initial
diversity between forms can be the result of natural variation, e.g., the different variant forms
(homologs) are obtained from different individuals or strains of an organism (including
geographic variants) or constitute related sequences from the same organism (e.g., allelic
variations). Alternatively, the initial diversity can be induced, e.g., the second variant form
can be generated by error-prone transcription, such as an error-prone PCR or use of a
polymerase which lacks proof-reading activity (see Liao (1990) Gene 88:107-111), of the first
variant form, or by replication of the first form in a mutator strain (mutator host cells are
discussed in further detail below). The initial diversity between substrates is greatly
augmented in subsequent steps of recursive sequence recombination.

The properties or characteristics that can be sought to be acquired or improved
vary widely and, of course, depend on the choice of substrate. For example, for viral and
nonviral vector sequences, improvement goals include higher titer, more stable expression,
improved stability, higher specificity targeting, higher frequency integration, reduced
immunogenicity of the vector sequence or an expression product thereof, and higher
expression of gene products. For genomic DNA from a packaging cell line used to package a
viral vector used in gene therapy, the goals of improvement include increasing the titer of
viruses produced by the cell line.

Improvement in a property or acquisition of a property is achieved by recursive 
sequence recombination. Recursive sequence recombination can be achieved in many 
different formats and permutations of formats, as described in further detail below. These 
formats share some common principles. Recursive sequence recombination entails
successive cycles of recombination to generate molecular diversity, that is, to create a family
of nucleic acid molecules showing some sequence identity to each other but differing in the
presence of mutations. In any given cycle, recombination can occur in vivo or in vitro,
intracellular or extracellular. Furthermore, diversity resulting from recombination can be
augmented in any cycle by applying prior methods of mutagenesis (e.g., error-prone PCR or
cassette mutagenesis) to either the substrates or products for recombination. In some
instances, a new or improved property or characteristic can be achieved after only a single
cycle of in vivo or in vitro recombination, as when using different, variant forms of the
sequence, such as homologs from different individuals or strains of an organism, or related
sequences from the same organism, such as allelic variations.

A recombination cycle is usually followed by at least one cycle of screening or
selection for molecules having a desired property or characteristic. If a recombination cycle
is performed in vitro, the products of recombination, i.e., recombinant segments, are
sometimes introduced into cells before the screening step. Recombinant segments can also be
linked to an appropriate vector or other regulatory sequences before screening. Alternatively,
products of recombination generated in vitro are sometimes packaged as viruses before
screening. If recombination is performed in vivo, recombination products can sometimes be
screened in the cells in which recombination occurred. In other applications, recombinant
segments are extracted from the cells, and optionally packaged as viruses, before screening.

The nature of screening or selection depends on what property or characteristic 
is to be acquired or the property or characteristic for which improvement is sought, and many 
examples are discussed below. It is not usually necessary or desirable to understand the 
molecular basis by which particular products of recombination (recombinant segments) have
acquired new or improved properties or characteristics relative to the starting substrates. For
example, a gene therapy vector can have many component sequences each having a different
intended role (e.g., coding sequence, regulatory sequences, targeting sequences,
stability-conferring sequences, and integration sequences). Each of these component sequences
can be varied and recombined simultaneously. Screening/selection can then be performed, for
example, for recombinant segments that have increased stable expression in a target cell
without the need to attribute such improvement to any of the individual component sequences
of the vector.

Initial round(s) of screening are often performed in bacterial cells due to high
transfection efficiencies and ease of culture. Later rounds can be performed in mammalian
cells to optimize recombinant segments for use in an environment close to that of their
intended use. Final rounds of screening can be performed in the precise cell type of intended
use (e.g., a stem cell). In some instances, this stem cell can be obtained from the patient to be
treated with a view, for example, to minimizing problems of immunogenicity in this patient.
In some methods, use of a gene therapy vector in treatment can itself be used as a round of
screening. That is, gene therapy vectors that are successfully taken up, integrated and/or
expressed by the intended target cells in one patient are recovered from those target cells and
used to treat another patient. The gene therapy vectors that are recovered from the intended
target cells in one patient are enriched for vectors that have evolved, i.e., have been modified
by recursive recombination, toward improved or new properties or characteristics for specific
uptake, integration and/or expression.

The screening or selection step identifies a subpopulation of recombinant
segments that have evolved toward acquisition of a new or improved desired property or
properties useful in gene therapy. Depending on the screen, the recombinant segments can be
identified as components of cells, components of viruses or in free form. More than one
round of screening or selection can be performed after each round of recombination.

At least one and usually a collection of recombinant segments surviving a first round of
screening/selection are subject to a further round of recombination. These recombinant
segments can be recombined with each other or with exogenous segments representing the
original substrates or further variants thereof. Again, recombination can proceed in vitro or in
vivo. If the previous screening step identifies desired recombinant segments as components
of cells, the components can be subjected to further recombination in vivo, or can be
subjected to further recombination in vitro, or can be isolated before performing a round of in
vitro recombination. Conversely, if the previous screening step identifies desired
recombinant segments in naked form or as components of viruses, these segments can be
introduced into cells to perform a round of in vivo recombination. The second round of
recombination, irrespective of how it is performed, generates further recombinant segments
which encompass additional diversity beyond that present in recombinant segments resulting
from previous rounds.

The second round of recombination can be followed by a further round of 
screening/selection according to the principles discussed above for the first round. The
stringency of screening/selection can be increased between rounds. Also, the nature of the
screen and the property being screened for can vary between rounds if improvement in more
than one property is desired or if acquiring more than one new property is desired. Additional 
rounds of recombination and screening can then be performed until the recombinant segments 
have sufficiently evolved to acquire the desired new or improved property or function. 

II. Formats for Recursive Sequence Recombination 

Exemplary formats and examples for using recursive sequence recombination,
sometimes referred to as DNA shuffling, sexual PCR or molecular breeding, have been
described by the present inventors and co-workers in copending application United States
Serial No. (USSN) 08/621,859, attorney docket no. 16528A-014612, filed March 25, 1996;
international application PCT/US95/02126, filed February 17, 1995, published as WO
95/22625; Stemmer (1995) Science 270:1510; Stemmer (1995) Gene 164:49-53; Stemmer
(1995) Bio/Technology 13:549-553; Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751;
Stemmer (1994) Nature 370:389-391; Crameri (1996) Nature Medicine 2:1-3;
Crameri (1996) Nature Biotechnology 14:315-319.

(1) In Vitro Formats

One embodiment for shuffling DNA sequences in vitro is illustrated in Fig. 1.
The initial substrates for recombination are a pool of related sequences, e.g., different, variant
forms, such as homologs from different individuals or strains of an organism, or related
sequences from the same organism, such as allelic variations. The X's in Fig. 1, panel A, show
where the
sequences diverge. The sequences can be DNA or RNA and can be of various lengths 
depending on the size of the gene or DNA fragment to be recombined or reassembled. 
Preferably the sequences are from 50 base pairs (bp) to 50 kilobases (kb). 
The pool of related substrates is converted into overlapping fragments, e.g.,
from about 5 bp to 5 kb or more, as shown in Fig. 1, panel B. Often, for example, the size of
the fragments is from about 10 bp to 1000 bp, and sometimes the size of the DNA fragments
is from about 100 bp to 500 bp. The conversion can be effected by a number of different
methods, such as DNase I or RNase digestion, random shearing or partial restriction enzyme
digestion. For discussions of protocols for the isolation, manipulation, enzymatic digestion,
and the like of nucleic acids, see, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual (2nd ed.), Vols. 1-3, Cold Spring Harbor Laboratory (1989)
(Sambrook), and Current Protocols in Molecular Biology, Ausubel et al., eds., Greene
Publishing and Wiley-Interscience, New York (1987) (Ausubel). The concentration of
nucleic acid fragments of a particular length and sequence is often less than 0.1% or 1% by
weight of the total nucleic acid. The number of different specific nucleic acid fragments in
the mixture is usually at least about 100, 500 or 1000.

The mixed population of nucleic acid fragments is converted to at least
partially single-stranded form using a variety of techniques, including, for example, heating,
chemical denaturation, use of DNA binding proteins, and the like. Conversion can be
effected by heating to about 80°C to 100°C, more preferably from 90°C to 96°C, to form
single-stranded nucleic acid fragments and then reannealing. Conversion can also be effected
by treatment with single-stranded DNA binding protein (see Wold (1997) Annu. Rev.
Biochem. 66:61-92) or recA protein (see Kiianitsa (1997) Proc. Natl. Acad. Sci. USA
94:7837-7840). Single-stranded nucleic acid fragments having regions of sequence identity
with other single-stranded nucleic acid fragments can then be reannealed by cooling to 20°C
to 75°C, and preferably from 40°C to 65°C. Renaturation can be accelerated by the addition
of polyethylene glycol (PEG), other volume-excluding reagents or salt. The salt
concentration is preferably from 0 mM to 200 mM, more preferably the salt concentration is
from 10 mM to 100 mM. The salt may be KCl or NaCl. The concentration of PEG is
preferably from 0% to 20%, more preferably from 5% to 10%. The fragments that reanneal
can be from different substrates as shown in Fig. 1, panel C. The annealed nucleic acid
fragments are incubated in the presence of a nucleic acid polymerase, such as Taq or Klenow,
and dNTPs (i.e., dATP, dCTP, dGTP and dTTP). If regions of sequence identity are large,
Taq polymerase can be used with an annealing temperature of between 45-65°C. If the areas
of identity are small, Klenow polymerase can be used with an annealing temperature of
between 20-30°C. The polymerase can be added to the random nucleic acid fragments prior
to annealing, simultaneously with annealing or after annealing.

The process of denaturation, renaturation and incubation in the presence of
polymerase of overlapping fragments to generate a collection of polynucleotides containing
different permutations of fragments is sometimes referred to as shuffling of the nucleic acid
in vitro. This cycle is repeated for a desired number of times. Preferably the cycle is repeated
from 2 to 100 times, more preferably the sequence is repeated from 10 to 40 times. The
resulting nucleic acids are a family of double-stranded polynucleotides of from about 50 bp to
about 100 kb, preferably from 500 bp to 50 kb, as shown in Fig. 1, panel D. The population
represents variants of the starting substrates showing substantial sequence identity thereto but
also diverging at several positions. The population has many more members than the starting
substrates. The population of fragments resulting from shuffling is used to transform host
cells, optionally after cloning into a vector.
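
The fragmentation and reassembly cycle described above can be mimicked in software to show why crossovers arise in regions of sequence identity. The Python sketch below is a deliberately simplified toy model, not a simulation of the biochemistry. It assumes equal-length, pre-aligned parent sequences, tracks each fragment by its start coordinate, and lets two fragments "anneal" only when their overlapping portions are identical. The parent sequences, fragment sizes, minimum overlap and cycle count are all hypothetical.

```python
import random

# Hypothetical parent sequences differing at two positions (indices 10 and 40).
PARENT_A = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGCTAGGCTAACGTTAGCCGATAAGCTTGA"
PARENT_B = PARENT_A[:10] + "T" + PARENT_A[11:40] + "C" + PARENT_A[41:]


def fragment(seq, count, rng, min_len=8, max_len=20):
    """Cut random overlapping fragments from one parent as (start, subsequence)."""
    frags = []
    for _ in range(count):
        size = rng.randint(min_len, min(max_len, len(seq)))
        start = rng.randint(0, len(seq) - size)
        frags.append((start, seq[start:start + size]))
    return frags


def anneal_and_extend(a, b, min_overlap):
    """Join two fragments if they overlap by at least min_overlap identical
    positions; return the extended product or None if they do not anneal."""
    (start_a, seq_a), (start_b, seq_b) = sorted([a, b])
    end_a = start_a + len(seq_a)
    overlap = end_a - start_b
    if overlap < min_overlap or start_b + len(seq_b) <= end_a:
        return None  # too little overlap, or nothing left to extend
    if seq_a[start_b - start_a:] != seq_b[:overlap]:
        return None  # mismatch in the overlap: no annealing in this toy model
    return (start_a, seq_a + seq_b[overlap:])


def shuffle_in_vitro(parents, cycles=40, frags_per_parent=60, min_overlap=5, seed=0):
    """Run repeated anneal/extend cycles and return full-length products."""
    rng = random.Random(seed)
    length = len(parents[0])
    pool = [f for p in parents for f in fragment(p, frags_per_parent, rng)]
    for _ in range(cycles):
        rng.shuffle(pool)
        next_pool = []
        for a, b in zip(pool[0::2], pool[1::2]):  # random pairwise encounters
            product = anneal_and_extend(a, b, min_overlap)
            if product is None:
                next_pool.extend([a, b])
            else:
                next_pool.append(product)
        if len(pool) % 2:
            next_pool.append(pool[-1])
        pool = next_pool
    return [seq for start, seq in pool if start == 0 and len(seq) == length]
```

Every full-length product returned by `shuffle_in_vitro([PARENT_A, PARENT_B])` carries, at each position, the base of one parent or the other. A product that takes position 10 from one parent and position 40 from the other is a recombinant formed by extension across a shared region of identity.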

In one embodiment utilizing in vitro shuffling, subsequences of
recombination substrates can be generated by amplifying the full-length sequences under
conditions which produce a substantial fraction, typically at least 20 percent or more, of
incompletely extended amplification products. Another embodiment uses random primers to
prime the entire template DNA to generate less than full length amplification products. The
amplification products, including the incompletely extended amplification products, are
denatured and subjected to at least one additional cycle of reannealing and amplification.
This variation, in which at least one cycle of reannealing and amplification provides a
substantial fraction of incompletely extended products, is termed "stuttering." In the
subsequent amplification round, the partially extended (less than full length) products
reanneal to and prime extension on different sequence-related template species. In another
embodiment, the conversion of substrates to fragments can be effected by partial PCR
amplification of substrates.

In another embodiment, a mixture of fragments is spiked with one or more
oligonucleotides. The oligonucleotides can be designed to include precharacterized mutations
of a wildtype sequence, or sites of natural variations between individuals or species. The
oligonucleotides also include sufficient sequence or structural homology flanking such
mutations or variations to allow annealing with the wildtype fragments. Annealing
temperatures can be adjusted depending on the length of homology.

In a further embodiment, recombination occurs in at least one cycle by
template switching, such as when a DNA fragment derived from one template primes on the
homologous position of a related but different template. Template switching can be induced
by addition of recA (see Kiianitsa (1997) supra), rad51 (see Namsaraev (1997) Mol. Cell.
Biol. 17:5359-5368), rad55 (see Clever (1997) EMBO J. 16:2535-2544), rad57 (see Sung
(1997) Genes Dev. 11:1111-1121) or other polymerases (e.g., viral polymerases, reverse
transcriptase) to the amplification mixture. Template switching can also be increased by
increasing the DNA template concentration.

Another embodiment utilizes at least one cycle of amplification, which can be
conducted using a collection of overlapping single-stranded DNA fragments of related
sequence and different lengths. Fragments can be prepared using a single-stranded DNA
phage, such as M13 (see Wang (1997) Biochemistry 36:9486-9492). Each fragment can
hybridize to and prime polynucleotide chain extension of a second fragment from the
collection, thus forming sequence-recombined polynucleotides. In a further variation, ssDNA
fragments of variable length can be generated from a single primer by Pfu, Taq, Vent, Deep
Vent, UlTma DNA polymerase or other DNA polymerases on a first DNA template (see
Cline (1996) Nucleic Acids Res. 24:3546-3551). The single-stranded DNA fragments are
used as primers for a second, Kunkel-type template, consisting of a uracil-containing circular
ssDNA. This results in multiple substitutions of the first template into the second. See
Levichkin (1995) Mol. Biology 29:572-577; Jung (1992) Gene 121:17 et seq.

Reintroduction of Genes Shuffled in vitro into Cells

In a further embodiment, whole cells and organisms can be improved by
evolving a transgene within those cells and organisms by recursive cycles of in vitro
shuffling. The transgene is subjected to the recursive recombination methods of the
invention, and the shuffled sequence library is put back into the cell/organism for selection.
While this method is useful if multiple copies of the modified transgene are reintegrated into
a cell, in a preferred variation of this selection assay, only a single copy of the modified
transgene is inserted into each cell. Another preferred variation of this selection assay
involves reducing the transcriptional expression variability of the modified transgene that
may result from differences in chromosomal location of integration sites. This requires a
means for defined, site-specific integration of the modified transgene. These methods can
also be used to evolve an episomal vector (which can replicate inside the cell) which can
site-specifically integrate into a chromosome.

Use of retroviruses to shuttle the modified transgene back into the cell for
selection has the advantage that they integrate as a single copy. However, this insertion is not
site-specific, i.e., the retrovirus inserts in a random location in the chromosome.
Adenoviruses and ars-plasmids are also used to shuttle modified transgenes; however, they
integrate as multiple copies. While wild-type AAV integrates as a single copy in
chromosome 19q, commonly used modified versions of AAV do not. Homologous
recombination is also used to insert a modified recombinant segment (transgene) into a
chromosome, but this method can be inefficient and may result in the integration of two
copies in the pair of chromosomes. To solve these problems, one embodiment of the
invention utilizes site-specific integration systems to target the transgene to a specific,
constant location in the genome. A preferred embodiment uses the Cre/LoxP or the related
FLP/FRT site-specific integration system. The Cre/LoxP system uses a Cre recombinase
enzyme to mediate site-specific insertion and excision of viral or phage vectors into a specific
palindromic 34 base pair sequence called a "LoxP site." LoxP sites can be inserted into a
mammalian genome of choice, to create, for example, a transgenic animal containing the
LoxP site, by homologous recombination (see Rohlmann (1996) Nature Biotech. 14:1562-1565).
If a genome is engineered to contain a LoxP site in a desired location, infection of such cells
with vectors carrying a gene for the Cre recombinase results in the efficient, site-specific
integration of the transgene-containing vector into the LoxP site. This approach is
reproducible from cycle to cycle and provides a single copy of the modified transgene
(recombinant sequence) at a constant, defined location. Thus, a transgene of interest can be
modified using the recursive sequence recombination methods of the invention in vitro and
reinserted into the cell for in vivo/in situ selection for the new or improved property in the
optimal way with minimal noise. This technique can also be used in vivo, as discussed
below. See, for example, Agah (1997) J. Clin. Invest. 100:169-179; Akagi (1997) Nucleic Acids
Res. 25:1766-1773; Xiao (1997) Nucleic Acids Res. 25:2985-2991; Jiang (1997) Curr. Biol.
7:R321-R323; Rohlmann (1996) Nature Biotech. 14:1562-1565; Siegal (1996) Genetics
144:715-726; Wild (1996) Gene 179:181-188. The evolution of Cre is discussed in further detail,
below.
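
The structure of the 34 base pair site can be made concrete with a short sketch. The sequence used here is the commonly cited wild-type loxP sequence; it is not given in the text above and is included as an assumption. The helper names (`split_lox_site`, `has_inverted_arms`, `find_sites`) are hypothetical. The sketch checks that the two 13 bp arms are inverted repeats of each other around an 8 bp spacer, and locates copies of the site in a sequence, which is one way to confirm that an engineered locus carries a single site at a defined position.

```python
# Commonly cited wild-type loxP sequence (assumption: not stated in the text).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox_site(site):
    """Split a 34 bp site into (13 bp left arm, 8 bp spacer, 13 bp right arm)."""
    if len(site) != 34:
        raise ValueError("a lox site is 34 bp long")
    return site[:13], site[13:21], site[21:]


def has_inverted_arms(site):
    """True if the right arm is the reverse complement of the left arm."""
    left, _, right = split_lox_site(site)
    return reverse_complement(left) == right


def find_sites(sequence, site=LOXP):
    """Return sorted 0-based start positions of the site on either strand."""
    hits = set()
    for pattern in (site, reverse_complement(site)):
        start = sequence.find(pattern)
        while start != -1:
            hits.add(start)
            start = sequence.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    locus = "GGCC" + LOXP + "TTAA"  # toy locus carrying one engineered site
    print(has_inverted_arms(LOXP), find_sites(locus))
```

The spacer is not itself symmetric, so the site has an orientation even though its arms are inverted repeats; the search therefore checks both strands.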

(2) In Vivo Formats

(a) Plasmid-Plasmid Recombination

The recursive recombination methods of the invention include plasmid-plasmid
recombinations. In this and other embodiments, the initial substrates for
recombination are a collection of polynucleotides comprising variant forms of nucleic acid of
interest, such as a gene, a vector, a transcriptional regulatory sequence, or the like. The
variant forms can have substantial sequence identity to each other; for example, sequence
identity sufficient to allow homologous recombination between substrates (see Datta (1997)
Proc. Natl. Acad. Sci. USA 94:9757-9762; Shimizu (1997) J. Mol. Biol. 266:297-305; Watt
(1985) Proc. Natl. Acad. Sci. USA 82:4768-4772). The diversity between the polynucleotides
can be natural (e.g., allelic or species variants), induced (e.g., in vitro generated, as by
error-prone PCR, see Light (1995) Bioorg. Med. Chem. 3:955-967), or the result of in vitro
recombination. Diversity can also result from resynthesizing genes encoding natural proteins
with alternative and/or mixed codon usage. There should be at least sufficient diversity
between substrates that recombination can generate more diverse products than there are
starting materials. There must be at least two substrates differing in at least two positions.
However, in another embodiment, a library of substrates of 10³-10⁸ members is employed.
The degree of diversity depends on the length of the substrate being recombined and the
extent of the functional change to be evolved. Diversity at between 0.1-50% of positions is
typical.
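
The requirement of at least two substrates differing in at least two positions follows from a simple count, sketched below. The sketch assumes aligned substrates of equal length and treats every position as freely recombinable, which is an idealization; the function name `possible_recombinants` is hypothetical.

```python
from itertools import product


def possible_recombinants(substrates):
    """Enumerate sequences obtainable by free recombination of aligned substrates."""
    length = len(substrates[0])
    if any(len(s) != length for s in substrates):
        raise ValueError("substrates must be aligned and of equal length")
    # At each position, collect the residues observed in any substrate.
    choices = [sorted({s[i] for s in substrates}) for i in range(length)]
    return {"".join(combo) for combo in product(*choices)}


if __name__ == "__main__":
    # One differing position: recombination cannot add anything new.
    print(len(possible_recombinants(["ACGT", "ACGA"])))
    # Two differing positions: four products from two starting substrates.
    print(sorted(possible_recombinants(["ACGT", "TCGA"])))
```

With a single differing position the products are just the two starting sequences, whereas two differing positions already give four products, so recombination generates more diversity than was present at the start. In this idealized model, two substrates differing at n positions give 2^n possible products.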

The diverse initial substrates or recombinant segments modified by the
methods of the invention can be incorporated into plasmids. In one embodiment, the
plasmids are standard cloning vectors, e.g., bacterial multicopy plasmids. However, in
alternative embodiments, described below, the plasmids include mobilization functions. The
initial substrates or recombinant segments can be incorporated into the same or different
plasmids. Often at least two different types of plasmid having different types of selection
marker are used to allow selection for cells containing at least two types of vector. Also,
where different types of plasmid are employed, the different plasmids can come from two
distinct incompatibility groups to allow stable co-existence of two different plasmids within
the cell. Nevertheless, plasmids from the same incompatibility group can still co-exist within
the same cell for sufficient time to allow homologous recombination to occur.

Plasmids containing diverse substrates are initially introduced into procaryotic
or eukaryotic cells by any transfection methods, e.g., chemical transformation, natural
competence, electroporation, viral transduction or biolistics (see, for example, Sambrook for
detailed descriptions of introducing DNA into cells; Hapala (1997) Crit. Rev. Biotechnol.
17:105-122). Often, the plasmids are present at or near saturating concentration (with respect
to maximum transfection capacity) to increase the probability of more than one plasmid
entering the same cell. The plasmids containing the various substrates or recombinant
segments can be transfected simultaneously or in multiple rounds. For example, in the latter
approach cells can be transfected with a first aliquot of plasmid, transfectants selected and
propagated, and then infected with a second aliquot of plasmid.

Having introduced the plasmids into cells, recombination between substrates
to generate recombinant genes or other nucleic acid segments occurs within cells containing
multiple different plasmids merely by propagating the plasmids in the cells. However, cells
that receive only one plasmid are unable to participate in recombination and the potential
contribution of substrates on such plasmids to evolution (sequence modification) is not fully
exploited, although these plasmids may contribute to new sequence diversity if they are
propagated in mutator cells (described below) or otherwise accumulate point mutations (i.e.,
by ultraviolet radiation treatment). The rate of evolution, i.e., modification of nucleic acid
sequence by the methods of the invention, can be increased by allowing all substrates to
participate in recombination. In one embodiment, this is achieved by subjecting transfected
cells to electroporation. The conditions for electroporation are the same as those
conventionally used for introducing exogenous DNA into cells (e.g., 1,000-2,500 volts, 400
µF and a 1-2 mm gap). Under these conditions, plasmids are exchanged between cells,
allowing all substrates to participate in recombination. In addition, the products of
recombination can undergo further rounds of recombination with each other or with the
original substrate.

In another embodiment, the rate of evolution, i.e., the rate of recursive
sequence modification, can also be increased by use of conjugative transfer. To exploit
conjugative transfer, substrates are cloned into plasmids having MOB genes, and tra genes
are also provided in cis or in trans to the MOB genes. The effect of conjugative transfer is
very similar to electroporation in that it allows plasmids to move between cells and allows
recombination between any substrate and the products of previous recombination to occur
merely by propagating the culture. The details of how conjugative transfer is exploited in
these vectors are discussed in more detail below (see also Cabezon (1997) Mol. Gen. Genet.
254:400-406).

The rate of evolution can also be increased by fusing cells to induce exchange
of plasmids or chromosomes. Fusion can be induced by chemical agents, such as PEG, or
viruses or viral proteins, such as influenza virus hemagglutinin, HSV-1 gB and gD, or
fusogenic liposomes (see Dzau (1996) Proc. Natl. Acad. Sci. USA 93:11421-11425).

The rate of evolution can also be increased by use of mutator host cells, e.g.,
bacterial Mut L, S, D, T, H mutator cells, insect (Drosophila) and mouse mutator cells, and
human cell lines with defective DNA repair mechanisms, such as those from Ataxia
telangiectasia patients, see Morgan (1997) Cancer Res. 57:3386-3389; Greener (1997) Mol.
Biotechnol. 7:189-195; Mason (1997) Genetics 146:1381-1397; Aronshtam (1996) Nucleic
Acids Res. 24:2498-2504; Seong (1995) Int. J. Radiat. Oncol. Biol. Phys. 33:869-874; Wu
(1994) J. Bacteriol. 176:5393-5400; Rewinski (1987) Nucleic Acids Res. 15:8205-8215;
Aizawa (1986) Jpn. J. Cancer 77:327-329.

The time for which cells are propagated and recombination is allowed to
occur, of course, varies with the cell type but is generally not critical, because even a small
degree of recombination can substantially increase diversity relative to the starting materials.
Cells bearing plasmids containing recombined genes are subject to screening or selection for
a desired function. For example, if the substrate being evolved contains a drug resistance
gene, one selects for drug resistance. Cells surviving screening or selection can be subjected
to one or more rounds of screening/selection followed by recombination or can be subjected
directly to an additional round of recombination.

The next round of recombination can be achieved by several different formats
independently of the previous round. For example, in one embodiment, a further round of
recombination can be effected simply by resuming (repeating) the electroporation or
conjugation-mediated intercellular transfer of plasmids described above. In another
embodiment, a fresh substrate or substrates, the same or different from previous substrates,
can be transfected into cells surviving selection/screening. The new substrates can be
included in plasmid vectors bearing a different selective (selection) marker(s) and/or from a
different incompatibility group than the original plasmids. Selection markers confer a
selectable phenotype on transformed cells. For example, the marker may encode antibiotic
resistance, particularly resistance to chloramphenicol, kanamycin, G418, bleomycin and
hygromycin, to permit selection of those cells transformed with the desired DNA sequences,
see for example, Blondelet-Rouault (1997) Gene 190:315-317. Because selectable marker
genes conferring resistance to substrates like neomycin or hygromycin can only be utilized in
tissue culture, chemoresistance genes are also used as selectable markers in vitro and in vivo.
Various target cells are rendered resistant to anticancer drugs by transfer of chemoresistance
genes encoding P-glycoprotein, multidrug resistance-associated protein-transporter,
dihydrofolate reductase, glutathione-S-transferase, O6-alkylguanine DNA alkyltransferase
(Tano (1997) J. Biol. Chem. 272:13250-13254), or aldehyde reductase (Licht (1997) Stem
Cells 15:104-111) and the like.

As a further embodiment, cells surviving selection/screening can be
subdivided into two subpopulations, and plasmid DNA from one subpopulation transfected
into the other, where the substrates from the plasmids from the two subpopulations undergo a
further round of recombination. In either of the latter two embodiments, the rate of evolution
can be increased by employing any of the techniques described above, including DNA
extraction, electroporation, conjugation or use of mutator cells. In a still further embodiment,
DNA from cells surviving screening/selection can be extracted and subjected to in vitro DNA
shuffling.

After the second round of recombination, a further round of screening/
selection can be performed. In one embodiment, the screening or selection is performed
under conditions of increased stringency. If desired, further rounds of recombination and
selection/screening can be performed using the same strategies as used in the second round.
With successive rounds of recombination and selection/screening, the surviving recombined
substrates evolve toward acquisition of a desired phenotype or characteristic. Typically, in
this and other recursive recombination methods of the invention, the final product of
recombination that has acquired the desired phenotype can differ from starting (initial)
substrates at 0.1%-25% of positions. The methods of the invention can evolve/modify
nucleic acid sequences at a rate orders of magnitude in excess (e.g., by at least 10-fold,
100-fold, 1000-fold, or 10,000-fold) of the rate calculated for naturally acquired mutation
(about 1 mutation per 10⁹ positions per generation, see Anderson (1996) Proc. Natl. Acad.
Sci. USA 93:906-907).
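
To give a sense of scale, the figures in the preceding paragraph can be applied to a hypothetical 1,000 bp substrate. The sketch below is plain arithmetic and takes the natural rate as about 1 mutation per 10⁹ positions per generation; the 1,000 bp length and all function names are assumptions chosen only for illustration.

```python
NATURAL_RATE_PER_POSITION = 1e-9  # mutations per position per generation


def positions_changed(length_bp, fraction_changed):
    """Number of positions that differ for a given fractional divergence."""
    return length_bp * fraction_changed


def natural_mutations_per_generation(length_bp, rate=NATURAL_RATE_PER_POSITION):
    """Expected naturally acquired mutations per generation in one substrate."""
    return length_bp * rate


def generations_per_natural_mutation(length_bp, rate=NATURAL_RATE_PER_POSITION):
    """Average number of generations needed to acquire one natural mutation."""
    return 1.0 / natural_mutations_per_generation(length_bp, rate)


if __name__ == "__main__":
    length = 1000  # hypothetical substrate length in bp
    print(positions_changed(length, 0.001), positions_changed(length, 0.25))
    print(natural_mutations_per_generation(length))
    print(generations_per_natural_mutation(length))
```

For this hypothetical substrate, a divergence of 0.1%-25% corresponds to roughly 1 to 250 changed positions, whereas natural mutation alone would be expected to change about one position per million generations.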

(b) Virus-Plasmid Recombination

The recursive recombination methods of the invention include virus-plasmid
recombinations. The strategy used for plasmid-plasmid recombination can also be used for
other embodiments of the invention, including virus-plasmid recombination, or phage-plasmid
recombination. However, some additional comments particular to the use of viruses are
appropriate. The initial substrates for recombination are cloned into both plasmid and viral
vectors. It is usually not critical which substrate(s) are inserted into the viral vector and
which into the plasmid, although usually the viral vector should contain different substrate(s)
from the plasmid. As before, the plasmid (and the virus) typically contains a selective
marker. The plasmid and viral vectors can both be introduced into cells by transfection as
described above. However, a more efficient procedure is to transfect the cells with plasmid,
select transfectants and infect the transfectants with virus. Because the efficiency of infection
of many viruses approaches 100% of cells, most cells transfected and infected by this route
contain both a plasmid and virus bearing different substrates.

Homologous recombination occurs between plasmid and virus, generating both
recombined plasmids and recombined virus. For some viruses, such as filamentous phage, in
which intracellular DNA exists in both double-stranded and single-stranded forms, both can
participate in recombination. Provided that the virus is not one that rapidly kills cells,
recombination can be augmented by use of electroporation or conjugation to transfer plasmids
between cells. Recombination can also be augmented for some types of virus by allowing the
progeny virus from one cell to reinfect other cells. For some types of virus, virus-infected
cells show resistance to superinfection. However, such resistance can be overcome by
infecting at high multiplicity and/or using mutant strains of the virus in which resistance to
superinfection is reduced.

The result of infecting plasmid-containing cells with virus depends on the
nature of the virus. Some viruses, such as filamentous phage, stably exist with a plasmid in
the cell and also extrude progeny phage from the cell (see Russel (1997) Gene 192:23-32).
Other viruses, such as lambda having a cosmid genome, stably exist in a cell like plasmids
without producing progeny virions. Other viruses, such as the T-phage and lytic lambda,
undergo recombination with the plasmid but ultimately kill the host cell and destroy plasmid
DNA. For viruses that infect cells without killing the host, cells containing recombinant
plasmids and virus can be screened/selected using the same approach as for plasmid-plasmid
recombination. Progeny virus extruded by cells surviving selection/screening can also be
collected and used as substrates in subsequent rounds of recombination. For viruses that kill
their host cells, recombinant genes resulting from recombination reside only in the progeny
virus. If the screening or selective assay requires expression of recombinant genes in a cell,
the recombinant genes should be transferred from the progeny virus to another vector, e.g., a
plasmid vector, and retransfected into cells before selection/screening is performed.

For filamentous phage, the products of recombination are present in both cells
surviving recombination and in phage extruded from these cells. The dual source of
recombinant products provides some additional options relative to the plasmid-plasmid
recombination. In one embodiment, DNA can be isolated from phage particles for use in a
round of in vitro recombination. In an alternative embodiment, the progeny phage can be
used to transfect or infect cells surviving a previous round of screening/selection, or fresh
cells transfected with fresh substrates for recombination.

(c) Virus-Virus Recombination

The recursive recombination methods of the invention also include virus-virus
recombinations. The principles described for plasmid-plasmid and plasmid-viral
recombination can be applied to virus-virus recombination with a few modifications. The
initial substrates for recombination are cloned into a viral vector. In a preferred embodiment,
the same vector is used for all substrates. Preferably, the virus is one that, naturally or as a
result of mutation, does not kill cells. After insertion, some viral genomes can be packaged in
vitro. The packaged viruses are used to infect cells at high multiplicity such that there is a
high probability that a cell receives multiple viruses bearing different substrates.
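
The effect of multiplicity on the chance that a cell receives more than one virus can be estimated with a standard Poisson model. This model is an assumption brought in for illustration, not part of the method described above: it supposes that particles enter cells independently, so that the number of particles per cell is Poisson-distributed with mean equal to the multiplicity of infection. The function name `prob_at_least` is hypothetical.

```python
import math


def prob_at_least(k, moi):
    """P(a cell receives >= k particles) for Poisson-distributed counts with mean moi."""
    below = sum(math.exp(-moi) * moi**i / math.factorial(i) for i in range(k))
    return 1.0 - below


if __name__ == "__main__":
    for moi in (0.1, 1.0, 5.0, 10.0):
        # Fraction of cells expected to carry two or more particles.
        print(moi, round(prob_at_least(2, moi), 4))
```

Under this model, low multiplicities leave most infected cells with a single genome and nothing to recombine with, while higher multiplicities make co-infection the common case.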

After the initial round of infection, subsequent steps depend on the nature of
infection, as discussed in the previous section. For example, if the viruses have phagemid
genomes such as lambda cosmids or M13, F1 or Fd phagemids, the phagemids behave as
plasmids within the cell and undergo recombination simply by propagating the cells.
Recombination is particularly efficient between single-stranded forms of intracellular DNA.
Recombination can be augmented by electroporation of cells. Following selection/screening,
cosmids containing recombinant genes can be recovered from surviving cells (e.g., by heat
induction of a cos- lysogenic host cell), repackaged in vitro, and used to infect fresh cells at
high multiplicity for a further round of recombination.

If the viruses are filamentous phage, recombination of replicating form DNA
occurs by propagating the culture of infected cells. Selection/screening identifies colonies of
cells containing viral vectors having recombinant genes with improved properties, together
with phage extruded from such cells. Subsequent options are essentially the same as for
plasmid-viral recombination.

(d) Chromosome-Plasmid Recombination

The recursive recombination methods of the invention also include
chromosome-plasmid recombinations. This format can be used to evolve both the
chromosomal and plasmid-borne substrates. The format is particularly useful in situations in
which many chromosomal genes contribute to a phenotype or one does not know the exact
location of the chromosomal gene(s) to be evolved. The initial substrates for recombination
are cloned into a plasmid vector. If the chromosomal gene(s) to be evolved are known, the
substrates constitute a family of sequences showing a high degree of sequence identity but
some divergence from the chromosomal gene. If the chromosomal genes to be evolved have
not been located, the initial substrates usually constitute a library of DNA segments of which
only a small number show sequence identity to the gene or gene(s) to be evolved. Divergence
between plasmid-borne substrate and the chromosomal gene(s) can be induced by
mutagenesis or by obtaining the plasmid-borne substrates from a different species than that of
the cells bearing the chromosome, as discussed above.

The plasmids bearing substrates for recombination are transfected into cells
having chromosomal gene(s) to be evolved/modified to acquire a new or modified property.
Evolution by recursive recombination can occur simply by propagating the culture. In
another embodiment, the nucleic acid sequence modification can be accelerated by
transferring plasmids between cells by conjugation or electroporation. In a further
embodiment, evolution by recursive recombination can be further accelerated by use of
mutator host cells or by seeding a culture of nonmutator host cells being evolved with mutator
host cells and inducing intercellular transfer of plasmids by electroporation or conjugation.
Preferably, mutator host cells used for seeding contain a negative selection marker to
facilitate isolation of a pure culture of the nonmutator cells being evolved.
Selection/screening identifies cells bearing chromosomes and/or plasmids that have evolved
toward acquisition or modification of a desired property or function.

Subsequent rounds of recombination and selection/screening proceed in
similar fashion to those described for plasmid-plasmid recombination. For example, further
recombination can be effected by propagating cells surviving recombination in combination
with electroporation or conjugative transfer of plasmids. Alternatively, plasmids bearing
additional substrates for recombination can be introduced into the surviving cells. Preferably,
such plasmids are from a different incompatibility group and bear a different selective marker
than the original plasmids to allow selection for cells containing at least two different
plasmids. As a further alternative, plasmid and/or chromosomal DNA can be isolated from a
subpopulation of surviving cells and transfected into a second subpopulation. Chromosomal
DNA can be cloned into a plasmid vector before transfection.

(e) Virus-Chromosome Recombination

The recursive recombination methods of the invention also include
chromosome-virus recombinations. As in previously described embodiments, the virus is
usually one that does not kill the cells, and is often a phage or phagemid. The procedure is
substantially the same as for plasmid-chromosome recombination. Substrates for
recombination are cloned into the vector. Vectors including the substrates can then be
transfected into cells or in vitro packaged and introduced into cells by infection. Viral
genomes recombine with host chromosomes merely by propagating a culture. Evolution can
be accelerated by allowing intercellular transfer of viral genomes by electroporation, or
reinfection of cells by progeny virions. Screening/selection identifies cells having
chromosomes and/or viral genomes that have evolved toward acquisition of a new or
modified property or desired function.

There are several options for subsequent rounds of recombination. For
example, viral genomes can be transferred between cells surviving selection/recombination
by electroporation. Alternatively, viruses extruded from cells surviving selection/screening
can be pooled and used to superinfect the cells at high multiplicity. Alternatively, fresh
substrates for recombination can be introduced into the cells, either on plasmid or viral
vectors.

III. Vectors Used in Gene Therapy
The invention provides for methods of modifying a vector by recursive
recombination for use in gene therapy. Broadly speaking, a gene therapy vector is an
exogenous polynucleotide which produces a medically useful phenotypic effect upon the
mammalian cell(s) into which it is transferred. A vector may or may not have an origin of
replication. For example, it is useful to include an origin of replication in a vector for
propagation of the vector prior to administration to a patient. However, the origin of
replication can often be removed before administration if the vector is designed to integrate
into host chromosomal DNA or bind to host mRNA or DNA. Vectors used in gene therapy
can be viral or nonviral. Viral vectors are usually introduced into a patient as components of
a virus. Illustrative vectors incorporating nucleic acids to be modified by the recursive
recombination methods of the invention include, for example, adenovirus-based vectors
(Cantwell (1996) Blood 88:4676-4683; Ohashi (1997) Proc. Natl. Acad. Sci. USA
94:1287-1292), Epstein-Barr virus-based vectors (Mazda (1997) J. Immunol. Methods
204:143-151), adeno-associated virus vectors, Sindbis virus vectors (Strong (1997) Gene
Ther. 4:624-627), herpes simplex virus vectors (Kennedy (1997) Brain 120:1245-1259) and
retroviral vectors (Schubert (1997) Curr. Eye Res. 16:656-662).

Nonviral vectors, typically dsDNA, can be transferred as naked DNA or
associated with a transfer-enhancing vehicle, such as a receptor-recognition protein,
liposome, lipoamine, or cationic lipid. This DNA can be transferred into a cell using a
variety of techniques well known in the art. For example, naked DNA can be delivered by
the use of liposomes which fuse with the cellular membrane or are endocytosed, i.e., by
employing ligands attached to the liposome, or attached directly to the DNA, that bind to
surface membrane protein receptors of the cell resulting in endocytosis. Alternatively, the
cells may be permeabilized to enhance transport of the DNA into the cell, without injuring the
host cells. One can use a DNA binding protein, e.g., HBGF-1, known to transport DNA into
a cell. These procedures for delivering naked DNA to cells are useful in vivo. For example,
by using liposomes, particularly where the liposome surface carries ligands specific for target
cells, or is otherwise preferentially directed to a specific organ, one may provide for the
introduction of the DNA into the target cells/organs in vivo.

A. Viral-Based Methods
Various viral vectors, such as retroviruses, adenoviruses, adenoassociated
viruses and herpes viruses, are used in gene therapy. They are often made up of two
components, a modified viral genome and a coat structure surrounding it (see generally Smith
(1995) Annu. Rev. Microbiol. 49, 807-838), although sometimes viral vectors are introduced
in naked form or coated with proteins other than viral proteins. Most current vectors have
coat structures similar to a wildtype virus. This structure packages and protects the viral
nucleic acid and provides the means to bind and enter target cells. However, the viral nucleic
acid in a vector designed for gene therapy can be changed in many ways. The goals of these
changes are to disable growth of the virus in target cells while maintaining its ability to grow
in vector form in available packaging or helper cells, to provide space within the viral
genome for insertion of exogenous DNA sequences, and to incorporate new sequences that
encode and enable appropriate expression of the gene of interest. Thus, vector nucleic acids
generally comprise two components: essential cis-acting viral sequences for replication and
packaging in a helper line and the transcription unit for the exogenous gene. Other viral
functions are expressed in trans in a specific packaging or helper cell line.
(1) Retroviruses

Retroviruses comprise a large class of enveloped viruses that contain single-stranded
RNA as the viral genome. During the normal viral life cycle, viral RNA is reverse-
transcribed to yield double-stranded DNA that integrates into the host genome and is
expressed over extended periods. As a result, infected cells shed virus continuously without
apparent harm to the host cell. The viral genome is small (approximately 10 kb), and its
prototypical organization is extremely simple, comprising three genes encoding gag, the
group specific antigens or core proteins; pol, the reverse transcriptase; and env, the viral
envelope protein. The termini of the RNA genome are called long terminal repeats (LTRs)
and include promoter and enhancer activities and sequences involved in integration. The
genome also includes a sequence required for packaging viral RNA and splice acceptor and
donor sites for generation of the separate envelope mRNA. Most retroviruses can integrate
only into replicating cells, although human immunodeficiency virus (HIV) appears to be an
exception. This property can restrict the use of retroviruses as vectors for gene therapy.

Retrovirus vectors are relatively simple, containing the 5' and 3' LTRs, a
packaging sequence, and a transcription unit composed of the gene or genes of interest, which
is typically an expression cassette. To grow such a vector, one must provide the missing viral
functions in trans using a so-called packaging cell line. Such a cell is engineered to contain
integrated copies of gag, pol, and env but to lack a packaging signal so that no helper virus
sequences become encapsidated. Additional features added to or removed from the vector
and packaging cell line reflect attempts to render the vectors more efficacious or reduce the
possibility of contamination by helper virus.

The main advantage of retroviral vectors is that they integrate in the
chromosome and are therefore potentially capable of long-term expression. They can be
grown in relatively large amounts, but care is needed to ensure the absence of helper virus.
(2) Adenoviruses

Adenoviruses comprise a large class of nonenveloped viruses containing linear
double-stranded DNA. The normal life cycle of the virus does not require dividing cells and
involves productive infection in permissive cells during which large amounts of virus
accumulate. The productive infection cycle takes about 32-36 hours in cell culture and
comprises two phases, the early phase, prior to viral DNA synthesis, and the late phase,
during which structural proteins and viral DNA are synthesized and assembled into virions.
In general, adenovirus infections are associated with mild disease in humans.

Adenovirus vectors are somewhat larger and more complex than retrovirus or
AAV vectors, partly because only a small fraction of the viral genome is removed from most
current vectors. If additional genes are removed, they are provided in trans to produce the
vector, which so far has proved difficult. Two types of adenovirus
vectors have been studied, E3-deletion and E1-deletion vectors. Some viruses in laboratory
stocks of wild-type virus lack the E3 region and can grow in the absence of helper. This ability
does not mean that the E3 gene products are not necessary in the wild, only that replication in
cultured cells does not require them. Deletion of the E3 region allows insertion of exogenous
DNA sequences to yield vectors capable of productive infection and the transient synthesis of
relatively large amounts of encoded protein.

Deletion of the E1 region disables the adenovirus, but such vectors can still be
grown because there exists an established human cell line (called "293") that contains the E1
region of Ad5 and that constitutively expresses the E1 proteins. Most recent gene-therapy
applications involving adenovirus have utilized E1 replacement vectors grown in 293 cells.

The main advantages of adenovirus vectors are that they are capable of
efficient episomal gene transfer in a wide range of cells and tissues and that they are easy to
grow in large amounts. The main disadvantage is that the host response to the virus appears
to limit the duration of expression and the ability to repeat dosing, at least with high doses of
first-generation vectors.

In another embodiment, the recursive recombination methods of the invention 
are used to construct a novel adenovirus-phagemid capable of packaging DNA inserts over 10
kilobases in size. Incorporation of a phage f1 origin in a plasmid using the methods of the
invention also generates a novel in vivo shuffling format capable of evolving whole genomes
of viruses, such as the 36 kb family of human adenoviruses. The widely used human
adenovirus type 5 (Ad5) has a genome size of 36 kb. It is difficult to shuffle this large
genome in vitro without creating an excessive number of changes which may cause a high
percentage of nonviable recombinant variants. To minimize this problem and achieve whole
genome shuffling of Ad5, an adenovirus-phagemid was constructed. The invention's
Ad-phagemid has been demonstrated to accept inserts as large as 15 and 24 kilobases and to
effectively generate ssDNA of that size. In a further embodiment, larger DNA inserts, as large
as 50 to 100 kb, are inserted into the Ad-phagemid of the invention, with generation of full
length ssDNA corresponding to those large inserts. Generation of such large ssDNA
fragments provides a means to evolve, i.e., modify by the recursive recombination methods of
the invention, entire viral genomes. Thus, this invention provides for the first time a unique
phagemid system capable of cloning large DNA inserts (>10 kb) and generating ssDNA in
vitro and in vivo corresponding to those large inserts. 

The genomes of related serotypes of human adenovirus are shuffled in vivo
using this unique phagemid system, as described in Example 4 and illustrated in Figure 6. The
genomic DNA is first cloned into a phagemid vector, and the resulting plasmid, designated an
"Admid," can be used to produce single-stranded (ss) Admid phage by using a helper M13
phage. To achieve in vivo recombination, ssAdmid phages containing the genomes of
homologous human adenoviruses are used to infect F+ mutS E. coli cells at a high
multiplicity of infection (MOI). The ssDNA is a better substrate for recombination enzymes such as
RecA. The high MOI ensures that the probability of having multiple cross-overs between
copies of the infecting ssAdmid DNA is high. The shuffled adenovirus genome is generated
by purification of the double-stranded Admid DNA from the infected cells and its introduction
into a permissive human cell line to produce the adenovirus library. This genomic shuffling
strategy is useful for creation of recombinant adenovirus variants with changes in multiple
genes. This allows screening or selection of recombinant variant phenotypes resulting from
combinations of variations in multiple genes.
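
As a rough illustration of why a high MOI favors recombination between different Admid genomes, the number of ssAdmid phage entering a single cell can be modeled as Poisson-distributed with a mean equal to the MOI. This model and the function name below are assumptions made for illustration and are not taken from the specification. Under that assumption, the fraction of cells receiving two or more phage copies, the minimum needed for recombination between infecting genomes, can be sketched as:

```python
import math


def fraction_coinfected(moi: float) -> float:
    """Fraction of cells receiving at least two phage, assuming Poisson uptake."""
    p0 = math.exp(-moi)        # cells receiving no phage
    p1 = moi * math.exp(-moi)  # cells receiving exactly one phage
    return 1.0 - p0 - p1


# Hypothetical MOI values chosen only to show the trend.
for moi in (0.1, 1.0, 5.0, 10.0):
    print(f"MOI {moi:>4}: {fraction_coinfected(moi):.3f}")
```

In this simplified model the co-infected fraction rises steeply with MOI, which is consistent with the use of a high MOI described above.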

(3) Adeno-Associated Virus (AAV)

AAV is a small, simple, nonautonomous virus containing linear single-stranded
DNA. See Muzyczka, Current Topics Microbiol. Immunol. 158, 97-129 (1992). The
virus requires co-infection with adenovirus or certain other viruses in order to replicate. AAV
is widespread in the human population, as evidenced by antibodies to the virus, but it is not
associated with any known disease. AAV genome organization is straightforward,
comprising only two genes: rep and cap. The termini of the genome comprise terminal
repeat (ITR) sequences of about 145 nucleotides.

AAV-based vectors typically contain only the ITR sequences flanking the
transcription unit of interest. The length of the vector DNA cannot greatly exceed the viral
genome length of 4680 nucleotides. Currently, growth of AAV vectors is cumbersome and
involves introducing into the host cell not only the vector itself but also a plasmid encoding
rep and cap to provide helper functions. The helper plasmid lacks ITRs and consequently
cannot replicate and package. In addition, helper virus such as adenovirus is often required.
The potential advantage of AAV vectors is that they appear capable of long-term expression
in nondividing cells, possibly, though not necessarily, because the viral DNA integrates. The
vectors are structurally simple, and they may therefore provoke less of a host-cell response
than adenovirus. A major limitation at present is that AAV vectors are extremely difficult to
grow in large amounts.

B. Nonviral Gene Transfer Methods

Nonviral nucleic acid vectors used in gene therapy include plasmids, RNAs,
antisense oligonucleotides (e.g., methylphosphonate or phosphorothioate), polyamide nucleic
acids, and yeast artificial chromosomes (YACs). Such vectors typically include an expression
cassette for expressing a protein or RNA. The promoter in such an expression cassette can be
constitutive, cell type-specific, stage-specific, and/or modulatable (e.g., by hormones such as
glucocorticoids; MMTV promoter). Transcription can be increased by inserting an enhancer
sequence into the vector. Enhancers are cis-acting sequences of between 10 and 300 base pairs
that increase transcription by a promoter. Enhancers can effectively increase transcription
when either 5' or 3' to the transcription unit. They are also effective if located within an
intron or within the coding sequence itself. Typically, viral enhancers are used, including
SV40 enhancers, cytomegalovirus enhancers, polyoma enhancers, and adenovirus enhancers.
Enhancer sequences from mammalian systems are also commonly used, such as the mouse
immunoglobulin heavy chain enhancer.

Gene therapy vectors of all kinds can also include a selectable marker gene. 
Examples of suitable markers include the dihydrofolate reductase gene (DHFR), the
thymidine kinase gene (TK), or prokaryotic genes conferring drug resistance: gpt
(xanthine-guanine phosphoribosyltransferase), which can be selected for with mycophenolic acid; neo
(neomycin phosphotransferase), which can be selected for with G418, hygromycin, or
puromycin; and DHFR (dihydrofolate reductase), which can be selected for with methotrexate
(Mulligan & Berg, Proc. Natl. Acad. Sci. (U.S.A.) 78, 2072 (1981); Southern & Berg, J. Mol.
Appl. Genet. 1, 327 (1982)).

Before integration, the vector has to cross many barriers, which can result in
only a very minor fraction of the DNA ever being expressed. Limitations to high level gene
expression include: loss of vector due to nucleases present in blood and tissues; inefficient
entry of DNA into a cell; inefficient entry of DNA into the nucleus of the cell and preference
of DNA for other compartments; lack of DNA stability in the nucleus (factors limiting nuclear
stability may differ from those affecting other cellular and extracellular compartments);
efficiency of integration into the chromosome; and site of integration.

These potential losses of efficiency can be addressed by including additional 
sequences in a nonviral vector besides the expression cassette from which the product 
effecting therapy is to be expressed. The additional sequences can have roles in conferring
stability both outside and within a cell, mediating entry into a cell, mediating entry into the
nucleus of a cell and mediating integration within nuclear DNA. For example, aptamer-like
DNA structures, or other protein binding sites can be used to mediate binding of a vector to 
cell surface receptors or to serum proteins that bind to a receptor thereby increasing the 
efficiency of DNA transfer into the cell. 

Other DNA sequences can directly or indirectly result in avoidance of certain 
compartments and preference for other compartments, from which escape or entry into the 
nucleus is more efficient. Other DNA sites and structures directly or indirectly bind to 
receptors in the nuclear membrane or to other proteins that go into the nucleus, thereby
facilitating nuclear uptake of a vector. Other DNA sequences directly or indirectly affect the
efficiency of integration. For integration by homologous recombination, important factors are
the degree and length of homology to chromosomal sequences, as well as the frequency of
such sequences in the genome (e.g., Alu repeats). The specific sequence mediating
homologous recombination is also important, since integration occurs much more easily in
transcriptionally active DNA. Methods and materials for constructing homologous targeting
constructs are described by, e.g., Mansour (1988) Nature 336:348; Bradley (1992)
Bio/Technology 10:534.

For nonhomologous, illegitimate and site-specific recombination,
recombination is mediated by specific sites on the therapy vector which interact with cell
encoded recombination proteins, e.g., Cre/Lox and Flp/Frt systems, as discussed above for in
vitro systems. See also Baubonis (1993) Nucleic Acids Res. 21:2025-2029, which reports that
a vector including a LoxP site becomes integrated at a LoxP site in chromosomal DNA in the
presence of Cre recombinase enzyme.

Nonviral vectors encoding products useful in gene therapy can be introduced
into an animal by means such as lipofection, biolistics, virosomes, liposomes,
immunoliposomes, polycation:nucleic acid conjugates, enhanced uptake of DNA, and
ex vivo transduction. Lipofection is described in, e.g., US
5,049,386, US 4,946,787, and US 4,897,355, and lipofection reagents are sold commercially
(e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are suitable for
efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO
91/17424, WO 91/16024.

Unlike existing viral-based gene therapy vectors, which can only incorporate a
relatively small non-viral polynucleotide sequence into the viral genome because of size
limitations for packaging virion particles, naked DNA or lipofection complexes can be used
to transfer large (e.g., 50-5,000 kb) exogenous polynucleotides into cells. This property of
nonviral vectors is particularly advantageous since many genes which can be delivered by
therapy span over 100 kilobases (e.g., amyloid precursor protein (APP) gene, Huntington's
chorea gene) and large homologous targeting constructs or transgenes can be required for
efficient integration. Optionally, such large genes can be delivered to target cells as two or
more fragments and reconstructed by homologous recombination within a cell (see WO
92/03917).

C. Applications of Gene Therapy

Gene therapy vectors can be delivered in vivo by administration to an
individual patient, typically by systemic administration (e.g., intravenous, intraperitoneal,
intramuscular, subdermal, or intracranial infusion) or topical application. Alternatively,
vectors can be delivered to cells ex vivo, such as cells explanted from an individual patient
(e.g., lymphocytes, bone marrow aspirates, tissue biopsy) or universal donor hematopoietic
stem cells, followed by reimplantation of the cells into a patient, usually after selection for 
cells which have incorporated the vector. 

An important application is the treatment of congenital disease, particularly in 
patients lacking both wildtype alleles of a recessive gene. The vector introduces a wildtype 
allele of the gene that allows synthesis of the corresponding gene product compensating for 
the absence of this product in the patient. Examples of recessive diseases include sickle cell 
anemia, beta-thalassemia, phenylketonuria, galactosemia, Wilson's disease, hemochromatosis, 
severe combined immunodeficiency, alpha-1-antitrypsin deficiency, albinism, alkaptonuria,
lysosomal storage diseases, Ehlers-Danlos syndrome, hemophilia, agammaglobulinemia,
diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome,
Fabry's disease and fragile X syndrome.

Another application of gene therapy is to introduce a gene that increases the 
resistance of a cell to infection by pathogenic organisms. The gene can encode an antisense
RNA to a sequence in the microorganism not found in the patient's genome. Alternatively,
the gene can encode a protein inhibitory to the microorganism. Examples of microorganisms
that can be inhibited by gene therapy include viruses (e.g., hepatitis (A, B, or C) virus, herpes
virus (e.g., VZV, HSV-1, HHV-6, HSV-II, CMV, and EBV), HIV, adenovirus, influenza
virus, flaviviruses, echovirus, rhinovirus, coxsackie virus, coronavirus, respiratory syncytial
virus, mumps virus, rotavirus, measles virus, rubella virus, parvovirus, vaccinia virus, HTLV
virus, dengue virus, papillomavirus, molluscum virus, poliovirus, rabies virus, JC virus and
arboviral encephalitis virus) and pathogenic bacteria (e.g., chlamydia, rickettsial bacteria,
mycobacteria, staphylococci, streptococci, pneumococci, meningococci and gonococci,
klebsiella, proteus, serratia, pseudomonas, legionella, diphtheria, salmonella, bacilli, cholera,
tetanus, botulism, anthrax, plague, leptospirosis, and Lyme disease bacteria). For example,
the HIV sequences Tat and Rev (Malim et al., Nature 338, 254 (1989)) are suitable targets for
antisense RNAs or RNA binding proteins.

A further application of gene therapy is the delivery of drug resistance genes
to noncancerous cells in a patient with a view to increasing the selectivity of chemotherapy for
cancer cells in the patient. For example, polynucleotides conferring resistance to a
chemotherapeutic agent (e.g., an expression cassette driving constitutive expression of the
hALDH-1 or hALDH-2 gene can confer resistance to cyclophosphamide) can be transferred to
non-neoplastic cells, especially hematopoietic cells.

In another application, gene therapy vectors are used to deliver a negative
selection gene to cells of a patient for which selective elimination is desired (e.g., cancer cells
or cells of a pathogen). Examples of negative selection genes include ricin, diphtheria
toxin, and HSV thymidine kinase (tk). Vectors bearing such genes can be selectively
introduced into target cells via a cell surface receptor for which the vector has specific
affinity. Expression of the negative selection gene (in the case of HSV tk in the presence of
ganciclovir) kills cells bearing the gene.

In another application, gene therapy vectors can be used to provide
protection in subjects at risk of infection or to treat subjects who have already been infected.
Such vectors encode immunogenic epitope(s) of pathogenic microorganisms and express the
epitopes in the patient, particularly in target tissues at primary risk of infection, such as the
oral and genital mucosa.



III. Application of Recursive Sequence Recombination to Gene Therapy

The methods of the invention can be used to develop or improve materials
used in gene therapy, including animals, cells and vectors for use in in vivo, ex
vivo and in vitro systems. This section discusses the application of recursive sequence
recombination to some specific goals in gene therapy. Many of these goals relate to
improvements in vectors used in gene therapy. Unless otherwise indicated, the methods are
applicable to both viral and nonviral vectors.

(A) Improved Titer of a Viral Vector

In one embodiment, viruses with improved titers can be developed using the
recursive recombination methods of the invention. The property of high viral titer can be an
advantage in propagating large amounts of a virus in vitro for use as an agent in gene therapy.
This property is also useful if it is desired that the virus replicate in a host tissue, such that
progeny viruses infect cells surrounding the initially infected cell. Titer of a virus can be
improved by recursive sequence recombination. The initial substrates for recombination can
be viral genomes showing sequence divergence as a result of natural or induced variation.
The substrates can be whole genomes or fragments thereof. Recombination of fragments is
useful for large genomes or in situations in which a part of the viral genome is known to be
particularly important in conferring high titer. The substrates can be recombined in vitro or
can be introduced into cells and recombined in vivo. Recombination in vivo can be used to
generate progeny viruses that can be screened directly. However, recombination in vitro
leads to recombinant genomes or fragments thereof. Whole recombinant genomes can be
packaged into viruses using a packaging cell line or an in vitro packaging system. Fragments 
of genomes are usually first assembled by DNA ligation. They are subsequently inserted into 
a viral genome before packaging. Irrespective of the precise route, one arrives at a population 
of viruses having genomes at least part of which constitutes a recombinant segment. 

The collection of viruses with recombinant genomes can be screened simply 
by propagating the viruses in cell culture for several generations. The viruses with the highest 
titer thereby acquire the highest representation among progeny viruses. If desired, viruses can
be plaque-purified and titers of individual viruses compared to identify the viruses with the
very best titer from a round of recombination. Alternatively, the viruses can be purified by serial
dilution to determine the very best titer viruses from a round of recombination.
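
The enrichment that results from serial propagation can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that each variant's share of the population is multiplied by a fixed relative yield at every passage and then renormalized. The variant names and yield values are hypothetical and are not data from this specification.

```python
def passage(freqs: dict[str, float], yields: dict[str, float], rounds: int) -> dict[str, float]:
    """Return variant frequencies after `rounds` passages of competitive growth."""
    current = dict(freqs)
    for _ in range(rounds):
        # Each variant expands in proportion to its (assumed) relative yield.
        grown = {v: f * yields[v] for v, f in current.items()}
        total = sum(grown.values())
        current = {v: g / total for v, g in grown.items()}
    return current


# Hypothetical library: three variants at equal starting frequency.
start = {"variant_a": 1 / 3, "variant_b": 1 / 3, "variant_c": 1 / 3}
rel_yield = {"variant_a": 1.0, "variant_b": 2.0, "variant_c": 5.0}
after = passage(start, rel_yield, rounds=5)
print({v: round(f, 4) for v, f in after.items()})
```

Under these assumptions the highest-yield variant comes to dominate the population within a few passages, which is why simple propagation can serve as a screen for titer.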

The genomes from viruses surviving screening are subject to a further round of 
recombination, which again can be performed in vivo or in vitro. For in vivo recombination, 
viruses having genomes containing the recombinant segments can, for example, be infected 
into a cell at high multiplicity. For in vitro recombination, viral DNA is isolated from viruses 
harboring recombinant DNA. The genomes from viruses surviving screening can be 
recombined with each other or with fresh substrates obtained from similar sources to the 
initial substrates. In some recombination steps, it is desirable to include an excess of the
wildtype version of the viral genome to reduce silent mutations. Again, recombination can be
performed with whole genomes or fragments thereof. Selection is repeated as before.
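
The alternation of recombination and selection described above can be sketched in outline as follows. This is a minimal, hypothetical simulation and not the laboratory procedure: genomes are represented as strings of 0 and 1, recombination as a single crossover between two selected parents, and titer by a stand-in score (the count of 1 positions). All function names and parameters are assumptions made for illustration.

```python
import random


def crossover(a: str, b: str, rng: random.Random) -> str:
    """Single crossover between two equal-length parental sequences."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(pool, score, rounds, keep, offspring, rng):
    """Alternate recombination and selection; return the final selected pool."""
    selected = sorted(pool, key=score, reverse=True)[:keep]
    for _ in range(rounds):
        children = [crossover(*rng.sample(selected, 2), rng) for _ in range(offspring)]
        # Survivors are carried forward, so the best score never decreases.
        selected = sorted(selected + children, key=score, reverse=True)[:keep]
    return selected


def score(genome: str) -> int:
    """Hypothetical stand-in for titer: number of beneficial ("1") positions."""
    return genome.count("1")


rng = random.Random(0)
start = ["".join(rng.choice("01") for _ in range(20)) for _ in range(30)]
final = evolve(start, score, rounds=10, keep=10, offspring=40, rng=rng)
print(max(map(score, start)), "->", score(final[0]))
```

Carrying survivors into the next round is a design choice of this sketch; it loosely mirrors the option, noted above, of recombining surviving genomes with each other or with fresh substrates.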

After several rounds of recombination and selection, viral mutants, or clones, 
capable of producing the desirable titer can be obtained. For example, without concentration 
of an infected cell culture, it is possible to achieve a concentration of evolved virus of at least 
10^ 10«or 10'^ viruses/ml.

(B) Improved Infectivity of a Virus

The infectivity of a virus means the percentage of viruses that infect a cell 
when an inoculum of viruses is contacted with an excess of cells. Obtaining a high infectivity
is particularly important with respect to the intended target cell-type. Thus, if a viral vector is
being used to deliver a beneficial expression product to a target tissue (e.g., lung cells lacking
a functional endogenous CFTR gene), it is usually desirable that as high a percentage of
viruses as possible infect that cell type.

The selection of substrates and means of recombination follows the same 
principles as discussed for improved viral titer. However, the means of screening viruses
bearing recombinant genomes is usually different. The previous selection does not
necessarily select for viruses having high infectivity because high titer can also be conferred
by high burst size per cell. To screen more specifically for high infectivity, clonal isolates of
viruses bearing recombinant segments are used to infect separate cultures of cells. The
percentage of viruses infecting cells can then be determined by, for example, counting cells
expressing a marker expressed by the viruses in the course of infection. After several rounds
of recombination and screening, viruses harboring recombinant genomes capable of infecting
50, 75, 95 or 99% of target cells are obtained.
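
The infectivity measurement described above reduces to a simple ratio. The sketch below is illustrative only: it assumes that cells are in large excess, so that each marker-positive cell reflects a single infecting virus, and the clone names and counts are hypothetical.

```python
def infectivity_percent(marker_positive_cells: int, viruses_in_inoculum: int) -> float:
    """Percentage of input viruses that infected a cell.

    Assumes cells are in large excess, so each marker-positive cell
    is taken to reflect a single infecting virus.
    """
    if viruses_in_inoculum <= 0:
        raise ValueError("inoculum must contain at least one virus")
    return 100.0 * marker_positive_cells / viruses_in_inoculum


# Hypothetical counts for three clonal isolates: (positive cells, input viruses).
isolates = {"clone_1": (420, 1000), "clone_2": (760, 1000), "clone_3": (955, 1000)}
ranked = sorted(isolates, key=lambda c: infectivity_percent(*isolates[c]), reverse=True)
print(ranked)
```

Ranking clonal isolates in this way identifies the candidates to carry into the next round of recombination.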

(C) Improved Packaging Capacity of a Virus

Viruses and vectors with the capability of incorporating increasing amounts of 
recombinant nucleic acid sequences, such as having an improved packaging capacity within
the viral capsid, can be developed using the recursive recombination methods of the
invention. As noted above, the viruses commonly used in gene therapy can package only a
limited genome length, thus restricting the capacity of viruses to accommodate large inserts.
Capacity of a virus can be improved using similar principles to those discussed above. In
these methods, the viral genome to be lengthened should have a site into which increasing
lengths of nucleic acid can be inserted in successive rounds of screening without affecting
other viral functions. Initially, one can start with a viral genome having an insert such that
the combined length of the genome is close to the existing maximum capacity of the virus.
The initial substrates for recombination are variant viral genomes as in the other methods.
The variation usually occurs other than in the length-conferring insert because the insert is
replaced in actual use of the vector. One source of starting substrates can be viral genomes
known to show sequence similarity with the virus to be evolved but which have a larger
genome packaging capacity. Recombination proceeds in the same manner as discussed
above. Viruses having recombinant genomes are then screened for titer or infectivity as
discussed above. Recombinant genomes from viruses having the best titer and/or infectivity
are manipulated to introduce a further insert to increase the genome length. There follow
further cycles of recombination, screening and increasing genome length, until viruses are
achieved that can accommodate inserts of the desired size. For example, the maximum insert
size used in most existing adenoassociated viral vectors is about 5 kb, which can be increased
to 10, 15, 20 or 50 kb or more.

(D) Improved Stability of a Virus 

Viruses with improved stability can be developed using the recursive 
recombination methods of the invention. Stability of a virus for use in gene therapy is 
important both in prolonging the shelf-life of the virus as a drug between manufacture and
administration, and in the subsequent ability of the virus to resist cellular degradative
mechanisms before reaching its target. The principles for selection of starting substrates and
performing recombination are the same as in other methods described above. Viruses bearing
recombinant genomes that have evolved to acquire greater stability can be selected by
exposing the viruses to destabilizing conditions and recovering surviving viruses. For
example, destabilizing conditions include temperature (hot or cold), mechanical disruption
(e.g., centrifugation or sonication), exposure to chemicals or exposure to biological degrading
agents such as proteases (e.g., serum proteases). Viruses surviving exposure to destabilizing
conditions are identified by propagation of treated viruses and collection of progeny. 
Sometimes, propagation proceeds only for one or a limited number of generations, since 
otherwise progeny viruses become biased toward those having genomes favoring high titer in 
addition to those having genomes conferring stability. 

(E) Improved Expression or Expression Regulation of a Vector Coded Sequence

Improved expression of a gene sequence of interest can be achieved by
performing the recursive sequence recombination methods of the invention. Usually viral or
nonviral vectors used in gene therapy encode a product to be expressed in an intended target
cell. The product can be a protein or RNA, such as an antisense RNA or RNA that
specifically binds a target protein, i.e., an aptamer. Usually, the coding sequence is operably
linked to an additional sequence, such as a regulatory sequence, to ensure its expression, such
as some or all of the following: an enhancer, a promoter, a signal peptide sequence, an intron
and/or a polyadenylation sequence. A desirable goal is to increase the level of expression of
functional expression product relative to that achieved with conventional vectors. Expression
can effectively be improved by a variety of means, including increasing the rate of production
of an expression product, decreasing the rate of degradation of the expression product or
improving the capacity of the expression product to perform its intended function.

Improved expression of selection markers can be achieved by performing
recursive sequence recombination. For purposes of selection, a gene product expressed from
a vector is sometimes an easily detected marker rather than a product having an actual
therapeutic purpose, e.g., a green fluorescent protein (see Crameri (1996) Nature Biotech.
14:315-319) or a cell surface protein. However, some genes having a therapeutic purpose,
e.g., drug resistance genes, themselves provide a selectable marker, and no additional or
substitute marker is required. Alternatively, the gene product can be a fusion protein
comprising any combination of detection and selection markers.

The substrates for recombination can be the full-length vectors or fragments
thereof including coding sequence and/or regulatory sequences to which the coding sequence
is operably linked. The substrates can include variants of any of the regulatory and/or coding
sequence(s) present in the vector. If recombination is effected at the level of fragments, the
recombinant segments should be reinserted into vectors before screening. If recombination
proceeds in vitro, vectors containing the recombinant segments are usually introduced into
cells before screening. Cells containing the recombinant segments can be screened by
detecting expression of the gene encoded by the selection marker. Internal reference marker
genes can be included on the vector to detect and compensate for variations in copy number
or insertion site. For example, if this marker is green fluorescent protein, cells with the
highest expression levels can be identified by FACS™. If the marker is a cell surface protein,
the cells are stained with a reagent having affinity for the protein, such as an antibody, and again
analyzed by FACS™. Recombinant segments from the cells showing highest expression are
used as some or all of the substrates in the next round of screening.
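
The use of an internal reference marker to compensate for copy number can be illustrated as a ratio calculation. In the sketch below, which is an assumption-laden illustration rather than an actual sorting protocol, each cell has a signal from the marker under test and a signal from the reference marker, and cells are ranked by the ratio of the two. The field names, readings and gating fraction are hypothetical.

```python
def top_normalized(cells, fraction=0.1):
    """Rank cells by test/reference signal ratio and keep the top fraction."""
    # Cells with no reference signal cannot be normalized and are excluded.
    usable = [c for c in cells if c["reference"] > 0]
    ranked = sorted(usable, key=lambda c: c["test"] / c["reference"], reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[:n_keep]


# Hypothetical per-cell fluorescence readings (arbitrary units).
cells = [
    {"id": "c1", "test": 900.0, "reference": 300.0},  # high copy number, ratio 3.0
    {"id": "c2", "test": 400.0, "reference": 50.0},   # low copy number, ratio 8.0
    {"id": "c3", "test": 100.0, "reference": 100.0},  # ratio 1.0
    {"id": "c4", "test": 50.0, "reference": 0.0},     # no reference signal, excluded
]
print([c["id"] for c in top_normalized(cells, fraction=0.34)])
```

In this example the cell with the highest raw signal (c1) is not the one selected, because its signal is largely explained by copy number; the normalized ratio favors c2.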

Evolution of Cytomegalovirus Transcriptional Regulatory Elements 

The major immediate-early (IE) region transcriptional regulatory elements,
including promoter and enhancer sequences (the promoter/enhancer region), of
cytomegalovirus (CMV) are widely used for regulating transcription in vectors used for gene
therapy because they are highly active in a broad range of cell types. Optimized CMV
transcriptional regulatory elements which direct increased levels of transgene expression are
generated by the recursive recombination methods of the invention, resulting in improved
efficacy of gene therapy. As the CMV promoter and enhancer are active in human and animal
cells, the improved CMV promoter/enhancer elements are used to express foreign genes both
in animal models and in clinical applications.

A library of chimeric transcriptional regulatory elements is created by DNA
shuffling of wild-type sequences from five related strains of CMV. The promoter, enhancer
and first intron sequences of the IE region are obtained by PCR from the CMV strains: human
VR-538 strain AD169 (Rowe (1956) Proc. Soc. Exp. Biol. Med. 92:418); human VR-977 strain
Towne (Plotkin (1975) Infect. Immunol. 12:521-527); rhesus VR-677 strain 68-1 (Asher
(1969) Bacteriol. Proc. 269:91); vervet VR-706 strain CSG (Black (1963) Proc. Soc. Exp.
Biol. Med. 112:601); and squirrel monkey VR-1398 strain SqSHV (Rangan (1980) Lab.
Animal Sci. 30:532). The promoter/enhancer sequences of the human CMV strains are 95%
homologous, and share 70% homology with the sequences of the monkey isolates, allowing
the use of family shuffling to generate a library of great diversity. Following shuffling, the
library is cloned into a plasmid backbone and used to direct transcription of a marker gene in
mammalian cells. An internal marker under the control of a native promoter can be included
in the plasmid vector. Expression markers, such as green fluorescent protein (GFP) and
CD86 (also known as B7.2, see Freeman (1993) J. Exp. Med. 178:2185, Chen (1994) J.
Immunol. 152:4929) can also be used. In addition, transfection of SV40 T antigen-
transformed cells can be used to amplify a vector which contains an SV40 origin of
replication. The transfected cells are screened by FACS sorting to identify those which
express high levels of the marker gene, normalized against the internal marker to account for
differences in vector copy numbers per cell. If desired, vectors carrying optimal, recursively
recombined promoter sequences are recovered and subjected to further cycles of shuffling and
selection.
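
As a rough illustration of why family shuffling of related sequences yields chimeras, the toy model below builds chimeric sequences from pre-aligned parents, allowing a template switch only where the current and candidate parents share a short run of identical bases. It is a sketch under stated assumptions: the three parent sequences are invented placeholders (not CMV sequences), and the parameters `min_identity_run` and `switch_prob` are hypothetical knobs, not values from the specification.

```python
import random


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def chimera(parents, min_identity_run=4, switch_prob=0.5, rng=None):
    """Build one chimeric sequence from pre-aligned, equal-length parents.

    A template switch is allowed only where the current and the candidate
    parent are identical over the last `min_identity_run` bases, a crude
    stand-in for homology-dependent reassembly.
    """
    rng = rng or random.Random()
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to equal length")
    current = rng.randrange(len(parents))
    out = []
    for i in range(length):
        out.append(parents[current][i])
        start = i - min_identity_run + 1
        if start >= 0 and rng.random() < switch_prob:
            candidate = rng.randrange(len(parents))
            if parents[candidate][start:i + 1] == parents[current][start:i + 1]:
                current = candidate
    return "".join(out)


# Invented placeholder sequences standing in for aligned homologs.
parents = [
    "ACGTACGTTTGACCAGTACGATCG",
    "ACGTACGTATGACCAGTTCGATCG",
    "ACGTTCGTTTGACCAGTACGAACG",
]
rng = random.Random(0)
library = [chimera(parents, rng=rng) for _ in range(5)]
```

Because switches are restricted to regions of identity, every position in a chimera carries a base found at that position in at least one parent, so the library recombines existing natural variation rather than introducing new point mutations.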

(F) Improved Activity of Drug Resistance Enzymes and Expression of Drug
Resistance Sequences

The recursive recombination methods of the invention also provide means
to improve the expression of drug resistance sequences/proteins. Many treatment regimes
entail administration of drugs having side effects on a particular cell type in the body. For
example, chemotherapy is notorious for killing cells other than the targeted cancer cells. See
Licht (1995) Cytokines & Molecular Therapy 1:11-20. Myelosuppression, or bone marrow
toxicity, is dose-limiting for many chemotherapeutic agents. This is not only a dangerous
side effect but also limits the effectiveness of chemotherapy. Indeed, the chemotherapy can
be fatal, either directly by loss of blood cell function or indirectly by causing secondary
cancers such as leukemia. It is possible to protect hematopoietic cells by delivering drug
resistance proteins via gene therapy. This principle has been demonstrated by a number of
studies in which murine bone marrow cells were protected against chemotherapeutic
alkylating agents by the overexpression of a protective alkyltransferase. Other drug resistance
proteins can be used for chemoprotection of normal tissues and can be targets for improved
expression using the methods of the invention. They include, for example, glutathione-S-
transferase, dihydrofolate reductase and superoxide dismutase.

Alkylating agents are especially toxic to the hematopoietic system, with
myelosuppression being the dose-limiting side effect. Hematopoietic cells are so susceptible
to alkylating agents that iatrogenic leukemias are a common occurrence. Alkylation therapy
can also cause severe pulmonary toxicity. This sensitivity can be attributed to the
low expression of the DNA repair protein O6-methylguanine-DNA methyltransferase in
hematopoietic cells (also called O6-alkylguanine-DNA alkyltransferase, MGMT or
alkyltransferase; EC 2.1.1.63). Alkylating agents, especially nitrosoureas, are used either
alone or in combination with other drugs to treat many types of cancer, such as Hodgkin's and
non-Hodgkin's lymphomas, multiple myeloma, malignant melanoma, brain neoplasms,
gastrointestinal cancers and lung cancers. Together these cases constitute over one third of all
cancers diagnosed. Thus, improving the effectiveness and decreasing the toxicity of
alkylation-based chemotherapeutic regimens will have a profound impact on health care.

The introduction of drug-resistance genes, such as MGMT, into bone marrow stem
cells or pulmonary cells via gene therapy is one way to overcome therapeutic limitations
created by the toxicity of chemotherapeutic agents. One approach is to transduce a patient's
hematopoietic precursors or pulmonary cells ex vivo and repopulate the bone marrow with
these cells before or after chemotherapy. See generally Dunbar (1996) Annu. Rev. Med.
47:11-20. Bone marrow is a relatively easy tissue to extract, manipulate and reintroduce into the
body. Pulmonary cells can also be extracted, manipulated and then reintroduced by an
inhalation delivery system. Alternatively, genes can be directly introduced into pulmonary
cells in vivo by use of spray delivery systems, such as aerosol delivery of transgenes using
cationic lipids, as described in Eastman (1997) Hum. Gene Ther. 8:765-773; Lee (1996)
Hum. Gene Ther. 7:1701-1717; or aerosol delivery of adenoviral vectors, see McDonald
(1997) Hum. Gene Ther. 8:411-422.

MGMT, which is found in all organisms examined, prevents the mutagenic, cytotoxic,
and carcinogenic effects of chemotherapeutic alkylating agents. MGMT removes alkyl
groups attached by such chemicals from the O6 position of guanine. These alkyl groups are
transferred irreversibly to a cysteine in the active site of the MGMT protein, inactivating the
alkyltransferase. Thus, the enzyme is a suicide enzyme and can act only stoichiometrically,
which is an important barrier to improvement of MGMT. Because each protein molecule acts
only once in a suicidal manner, the protection afforded a cell is determined not only by the
activity (quality) of the MGMT but also by the number of MGMT molecules. Cells, such as
bone marrow cells, which express little or no alkyltransferase are very sensitive to laboratory
alkylating agents such as N-methyl-N'-nitro-N-nitrosoguanidine (MNNG) (Day (1980) Nature
288:724-727) and clinically used nitrosoureas (Erickson (1980) Nature 288:727-729). Thus,
myelosuppression is a serious problem with alkylation-based chemotherapeutic regimens
(DeVita (1993) Cancer: Principles and Practice of Oncology), but it has been overcome in
experiments in which the wild-type human, mouse, or bacterial alkyltransferase genes were
transduced into human and mouse hematopoietic cells. The overexpressed genes, carried on
retroviral vectors, protected stem cells in culture from killing by nitrosoureas (Allay (1995)
Blood 85:3342-3351; Moritz (1995) Cancer Res. 55:2608-2614). Furthermore, when these
cells were transplanted into the bone marrow of mice, the protection proved to be long-lasting
in vivo (Maze (1996) Proc. Natl. Acad. Sci. USA 93:206-210). Similar effects were seen
when liver and thymus rather than bone marrow were targeted (Dumenco (1993) Science
259:219-222; and Nakatsuru (1993) Proc. Natl. Acad. Sci. USA 90:6468-6472).
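
The point that a suicide enzyme protects in proportion to both its activity and its copy number can be made concrete with a deliberately simple, deterministic toy model. Everything in it is an assumption made for illustration: `activity` is a hypothetical per-step fraction of possible molecule-lesion encounters that result in repair, and the lesion and molecule counts are arbitrary numbers, not measured values.

```python
def simulate_repair(lesions, molecules, activity, steps):
    """Toy model of stoichiometric (suicide) repair.

    In each step a fraction `activity` of the possible one-to-one
    molecule-lesion pairings results in repair. Each repair removes one
    lesion and permanently consumes one enzyme molecule.
    Returns (remaining_lesions, remaining_active_molecules).
    """
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity must be between 0 and 1")
    lesions = float(lesions)
    molecules = float(molecules)
    for _ in range(steps):
        repaired = activity * min(lesions, molecules)
        lesions -= repaired
        molecules -= repaired
    return lesions, molecules


# Copy number limits protection no matter how active the enzyme is:
low_copy = simulate_repair(lesions=1000, molecules=200, activity=1.0, steps=1)
high_copy = simulate_repair(lesions=1000, molecules=2000, activity=1.0, steps=1)

# With enzyme in excess, higher activity clears lesions faster:
fast = simulate_repair(lesions=1000, molecules=2000, activity=0.5, steps=2)
slow = simulate_repair(lesions=1000, molecules=2000, activity=0.25, steps=2)
```

In this sketch a perfectly active enzyme present at 200 copies still leaves 800 of 1000 lesions unrepaired, which is the sense in which expression level and specific activity are separate targets for improvement.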

This protective effect of MGMT can be improved by recursive sequence
recombination in several respects. First, novel variants can be selected having higher specific
activity, i.e., faster repair of cytotoxic alkylation-induced lesions. Thus, for a given
expression level, bone marrow cells will be better protected. Some improvement in MGMT
has been reported (Christians (1996) Proc. Natl. Acad. Sci. USA 93:6124-6128) using
conventional cassette mutagenesis. Second, novel variants can be selected for resistance to
inhibitors of wild-type alkyltransferases, such as O6-benzylguanine. Such inhibitors are
sometimes used to suppress endogenous alkyltransferases present in cancer cells (Pegg (1995)
Progress in Nucleic Acid Res. and Molec. Biol. 51:167-223). Inhibitor-resistant MGMT can
be used to transfect bone marrow in treatment protocols in which alkylating agents are
combined with inhibitors of alkyltransferases. Third, novel variants of the coding sequence
and/or operably linked regulatory sequences can be selected for improved expression of
MGMT. Fourth, variants of MGMT can be produced that bind to but do not remove alkyl
adducts from DNA, effectively resulting in DNA-protein crosslinks more toxic to the cell
than the alkyl adducts alone. Vectors expressing the mutant variants can be targeted to cancer
cells before treatment with the alkylating substrate. Fifth, MGMT variants can be selected to
protect mammalian cells against the clinically relevant nitrosoureas. For this purpose,
selection should preferably be performed in mammalian cells rather than bacterial cells,
because the protective effect of MGMT against nitrosoureas is stronger in the former. At
least the final round of selection is usually performed in stem cells because some of the
component factors contributing to the end point of drug resistance may be cell-type dependent.

Because expression levels are important for the protective effect, evolution of
vector sequences other than that encoding MGMT provides an important source of
improvement. The vectors are selected based on desired endpoints, such as the ability to
protect cells from alkylating agents. The endpoint is achieved by a variety and a combination
of components too complicated to predict, including enhanced transduction, better vector
stability, and improved transcription of the gene in addition to improved MGMT activity.

The sometimes-low transfection efficiencies of gene therapy are not a major
limitation in ex vivo methods because alkylation treatment effectively serves as a positive
selection for transfected cells. In contrast, low transfection efficiencies can be a problem in in
vivo gene replacement therapy because there is no generally positive selection, only negative
selection by tumoricidal gene therapy. Improved means [...]
replacement therapy allows, for example, a relatively small number of chemoresistant
hematopoietic cells to repopulate the bone marrow.

A drug-resistance gene that is a starting material for improvement using the
methods of the invention is the multi-drug resistance gene MDR-1. MDR-1 encodes a plasma
membrane glycoprotein called "P-glycoprotein" ("Pgp") which acts as an ATP-dependent drug-
efflux pump and confers chemoresistance to a wide variety of drugs (Chin (1993) Adv.
Cancer Res. 60:157). Cells not expressing MDR-1 are exquisitely sensitive to drugs such as
vincristine, etoposide, and colchicine. This same chemoresistance property, which when
expressed by tumor cells can frustrate chemotherapy efforts, can be turned to an advantage
when used as a positive selectable marker. Metz (1996) Virology 217:230-241, reported a
20-fold higher stringency when selecting for MDR-1 expression compared to neo selection.
P-glycoprotein has been demonstrated to positively select for transformed cells in the in vitro
correction of cells from at least two different genetic diseases, Fabry disease (Sugimoto
(1995) Human Gene Therapy 6:905-915) and chronic granulomatous disease (Sokolic (1996)
Blood 87:42-50). However, there is no reason to believe that nature has optimized MDR-1 for
activity against man-made drugs. Improving MDR-1 by recursive recombination to improve
protection of cells from drugs such as etoposide and colchicine will allow the use of higher
levels of such selective agents, which will increase the selection stringency and better
differentiate between transformed and non-transformed cells.

MDR-1 is improved/modified by DNA shuffling followed by positive selection
in mammalian cells. Randomly mutated pools of MDR-1 are inserted into appropriate vectors
(e.g., retroviral, adenoviral vectors) and transformed into drug-sensitive cells. Selection with
colchicine and/or etoposide and/or vincristine will identify active MDR-1 variants. The
MDR-1 genes are rescued from surviving cells and subjected to additional rounds of
recombination and selection with increasing doses of drugs.
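
The cycle of selection, rescue, and further recombination at increasing drug doses can be outlined as a simple loop. The sketch below is an abstraction under stated assumptions, not the laboratory protocol: variants are represented by a single hypothetical "resistance" number, and the `survives` and `diversify` callables are placeholders for drug selection and for recombination of rescued genes, respectively.

```python
def escalating_doses(start, factor, rounds):
    """Dose schedule in which each round is `factor` times the previous one."""
    return [start * factor ** i for i in range(rounds)]


def run_rounds(pool, doses, survives, diversify):
    """Alternate selection and diversification over a dose schedule.

    pool      -- list of variants entering the first round
    survives  -- callable (variant, dose) -> bool, stands in for drug selection
    diversify -- callable (survivors) -> new pool, stands in for recombination
    Returns (final_pool, history), where history lists (dose, n_survivors).
    If no variant survives a round, the loop stops and the pool that entered
    that round is returned.
    """
    history = []
    for dose in doses:
        survivors = [v for v in pool if survives(v, dose)]
        history.append((dose, len(survivors)))
        if not survivors:
            break
        pool = diversify(survivors)
    return pool, history


# Hypothetical example: a variant is just its resistance level, and
# "recombination" yields each survivor plus one improved derivative.
doses = escalating_doses(start=1.0, factor=2.0, rounds=3)
final_pool, history = run_rounds(
    pool=[0.5, 1.0, 2.0],
    doses=doses,
    survives=lambda variant, dose: variant >= dose,
    diversify=lambda survivors: [x for v in survivors for x in (v, v * 1.5)],
)
```

Passing selection and diversification in as functions keeps the loop itself independent of any particular drug, cell type, or recombination format.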

Because some mammalian cells already express high levels of P-glycoprotein,
it might be difficult to determine whether the improved MDR-1 transgene is expressed in
these cells; i.e., the background will be high. In this case the endogenous P-glycoprotein is
inactivated with a well-characterized inhibitor such as verapamil, and the cells are transformed
with a marker MDR-1 transgene that encodes a mutant P-glycoprotein, resistant to the inhibitor yet
highly active against the cytotoxic drug. Such a variant is created by selecting MDR-1 mutant
pools in the presence of both the inhibitor and the cytotoxic drug(s), such as colchicine. For
example, the methods of the invention are used to create alkyltransferase mutants super-active
against the cytotoxic chemical N-methyl-N'-nitro-N-nitrosoguanidine (MNNG) and
completely resistant to the alkyltransferase inhibitor benzylguanine.

MDR-1 thus optimized as a positive selection marker is inserted into the vector
of choice. The vector can also be optimized by DNA shuffling, either by itself or in
combination with MDR-1 mutagenesis (MDR-1 and the vector shuffled as a unit). Shuffling
the entire construct allows many parameters to be tested at once. Bicistronic arrays, 2 genes
transcribed as one mRNA from the same promoter but translated from separate ribosome
binding sites, can be used (Sugimoto (1995) Human Gene Ther., supra). Shuffling the entire
array or the whole construct can be used to optimize secondary structure of the bicistronic
mRNA to improve translation of the second, downstream gene. For example, a bicistronic
retroviral vector encoding MDR-1 and a gene complementing a genetic defect can be
constructed and optimized using the methods of the invention. The entire vector can be
mutagenized by DNA shuffling and reassembled. Additionally, the vector can be packaged as
virus by a packaging cell line, transfected into the defective cells, and selected with
colchicine. Selection is effected by analyzing surviving cells for complementation of the
genetic defect.

MDR-1 can be evolved/modified not only to confer improved protection against
drugs it already recognizes (e.g., etoposide) but also to confer protection against drugs not
recognized by wild-type MDR-1, such as alkylating agents. For example, an MDR-1 gene can
be modified by recursive recombination (evolved) to pump alkylating agents out of a cell, thus
serving as a complement to MGMT (described above). For example, both the MGMT and
MDR-1 genes can be transduced into stem cells before combination chemotherapy in which
one of the drugs is an alkylator. Studies in which stem cells were transduced with the wild-
type MDR-1 gene gave results similar to those cited above with MGMT for alkylating agents
(Sorrentino (1992) Science 257:99).

Another suitable gene for evolution/modification using the methods of the
invention is glutathione-S-transferase, which detoxifies alkylating agents in the cytoplasm,
complementing MGMT, which acts on drugs after they have entered the nucleus and damaged
DNA. Some improvements in glutathione-S-transferase resulting from conventional cassette
mutagenesis in bacteria have already been reported (Gulick (1995) Proc. Natl. Acad. Sci.
USA 92:8140-8144). Further evolution by recursive sequence recombination will provide
additional improvements. The improved gene can then be transfected into stem cells or
lung cells on its own or in combination [...].

Other drug-resistance genes are candidates for evolution to reduce
side effects in other tissues. For example, bleomycin is an antineoplastic whose major
toxicity is to pulmonary cells. The protein bleomycin hydrolase can protect cells from
bleomycin, and the human gene was recently cloned (Bromme (1996) Biochemistry 35:6706-
6714). The gene can be improved by gene shuffling and used to protect pulmonary cells in
cancer patients.

In other embodiments, candidate genes for improvement include the genes
encoding DNA ligase and topoisomerase to protect against ionizing radiation (Boothman
(1994) Cancer Res. 54:4618-4626), genes encoding nucleotide excision repair enzymes such
as T4 endonuclease V to protect against UV irradiation and skin cancer, and genes encoding
alkaline phosphatase endonuclease and glycosylases to improve the base excision repair
pathway which is crucial to ward off the effects of oxidative DNA lesions thought to cause
many types of cancer and accelerated aging.

Evolution/modification of drug-resistance genes and associated regulatory 
sequences using the methods of the invention falls under the general approach discussed 
above for improving gene expression. However, in evolving drug-resistance genes, it is 
sometimes desirable not only to improve expression of the gene but to increase the degree of 
resistance conferred by the gene product with a particular drug. In this situation, it is 
preferable that substrates for recombination include the drug-resistance gene as well as 
associated regulatory sequences so that the resistance gene can be evolved within the genetic 
context in which it is to be expressed. Diversity between the initial substrates can be the 
result of induced mutations, natural drug-resistance genes from different sources, and
mutations already known to confer improved properties.

For example, the cDNA sequences of five different mammalian species of 
MGMT (human, rat, mouse, hamster, and rabbit) have been reported, and, despite very 
extensive homology, variations do exist, as illustrated in Figure 4. Following is an alignment
showing the human amino acid sequence on the top line with other amino acid sequences
found in nature shown below the human sequence.

MDKDCEMKRT TLDSPLGKLE LSGCEQGLHE IKLLGKGTSA ADAVEVPAPA 50
AETKL YSVFH AM R G RFPSGK FN T FT A TP
IDKA I S SSKC
E S

AVLGGPEPLM QCTAWLNAYF HQPEAIEEFP VPALHHPVFQ QESFTRQVLW 100
EL S V ET E RE A TPGL L E
N A CD TT P G

SPPAGRN 207
PELS
K

The natural variations can be incorporated by any of the formats discussed in
Section II to generate recombinant forms of MGMT including natural segments unique to
human and nonhuman forms, as discussed in Example 3. For example, oligonucleotides can
be designed to encode all the different combinations of natural variants, and these
oligonucleotides will be mixed in with the fragmented wild-type human gene. A surprisingly
small number of oligonucleotides (twenty-one) can be used if they are degenerate at positions
at which more than two amino acids are represented in nature (see Figure 3). The
oligonucleotides shown in Figure 3 contain up to twenty one [...]
human sequence flanked on either side by a 20-base sequence perfectly matched with the
human MGMT sequence. Another use of "oligo spiking" is to bias shuffled gene pools
toward known desirable mutations such as the V139F mutation demonstrated to improve the
wild-type protein (Christians (1996) Proc. Natl. Acad. Sci. USA 93:6124-6128) or mutations
conferring O6-benzylguanine resistance.
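
The reason a small set of degenerate oligonucleotides can encode a large number of variant combinations is simple combinatorics, sketched below. The positions and residues in `variants` are hypothetical placeholders chosen for illustration and are not taken from the MGMT alignment or from Figure 3.

```python
from itertools import product
from math import prod

# Hypothetical variable positions mapped to the residues observed across
# species at each position (placeholder data, not the real alignment).
variants = {10: "TS", 31: "IV", 45: "EDK"}


def library_size(variants):
    """Number of distinct combinations if every position varies independently."""
    return prod(len(set(residues)) for residues in variants.values())


def enumerate_combinations(variants):
    """Yield each combination as a {position: residue} mapping."""
    positions = sorted(variants)
    choices = (sorted(set(variants[p])) for p in positions)
    for combo in product(*choices):
        yield dict(zip(positions, combo))


n_combinations = library_size(variants)
n_degenerate_oligos = len(variants)  # one degenerate oligo per variable position
all_combinations = list(enumerate_combinations(variants))
```

In this toy case three degenerate oligonucleotides, one per variable position, are enough to cover twelve combinations once they are recombined with the fragmented parental gene; the number of combinations grows multiplicatively with each added position while the oligonucleotide count grows only additively.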

An alternative to "oligo spiking" is to obtain all the individual cDNAs and
shuffle them together. This option might have some tendency to dilute the human character
of the pool, leading to immunogenic problems when used in human gene therapy, but this
problem can be overcome by backcrossing mutants with the wild-type human gene to
eliminate non-useful mutations.

Recombined drug-resistance genes and vectors encoding them can readily be
screened for improved expression. Cells containing the vectors
are exposed to the drug and surviving cells recovered. These cells are enriched for
recombinant segments conferring improved resistance to the drug. Screening can be made
more stringent in successive rounds by increasing the concentration of drug or duration of
exposure thereto.

(G) Evolution of Transducing Vectors for Integration and Stable Expression in
Mammalian Stem Cells

Vectors having new and/or improved ability to infect, integrate and express
themselves in hematopoietic stem cells can be developed using the recursive recombination
methods of the invention. A major goal in gene therapy is to develop practical methods to
efficiently integrate DNA constructs into human stem cells. A practical method for
efficiently integrating retroviruses into stem cells allows repopulation of patients with
autologous bone marrow that has been genetically modified with traits of interest. For
example, the stem cells are engineered to express trans-dominant factors that interfere with
viral replication. Stem cells are engineered to express wild type or engineered transgenes that
complement a defined genetic defect, such as Gaucher's disease. MDR genes or
alkyltransferase genes are inserted into stem cells to confer resistance to chemotherapeutic
agents. Genes encoding T cell receptors specific for cancer or pathogen epitopes of interest
are inserted for expression upon maturation of the stem cell.

However, stem cells are difficult to purify and rapidly lose their pluripotent
phenotype if propagated in vitro. Retroviruses are very inefficient at integrating into
nondividing cells in general, and stem cells in particular. Thus, recursive recombination is
used to evolve a factor or set of factors that, upon infection with and expression of the
retrovirus genes prior to integration, can transiently or permanently render a stem cell
susceptible to retroviral integration while at the same time remaining pluripotent.

In one embodiment, large (>10^[...]) libraries of retroviruses expressing candidate
factors for transiently perturbing stem cells so as to promote retroviral integration are made.
Such factors include, but are not limited to: HIV matrix, HIV vpr, random fragments of HIV
and other lentiviruses (the only class of retroviruses able to efficiently transduce non-dividing
cells); cDNAs from stem cells; cDNA from stromal cell cultures (which make factors that
influence the differentiation state of stem cells, and overproduction or evolution of
recombined forms of these factors can exert the desired effect); or any other cytokine or growth
factor. Such libraries are used in the in vitro and in vivo recursive recombination methods of the
invention, as generally described above, to create a retrovirus which can efficiently infect,
integrate and express sequences and proteins of interest in non-dividing stem cells.

Another embodiment repopulates SCID or SCID/NOD mice with human stem
cells that have been transduced by a retrovirus modified by the above methods. Progeny of
retroviruses from stem cells that were successfully transduced by a member of the initial
retrovirus recombinant segment library are recovered. Selection markers, such as green
fluorescent protein (GFP), drug markers, or cell sorter (FACS) markers may be encoded in
the transducing retrovirus to facilitate recovery of repopulating stem cells transduced with a
retrovirus construct. Sequences encoding the factors to be evolved/modified or the entire
integrated retroviral genome can be recovered. Further rounds of recursive sequence
recombination can be repeated until the desired efficacy/efficiency of stem cell transduction
is achieved.

A murine SCID/NOD immunodeficient system that can be repopulated with
primitive human hematopoietic stem cells can be used (Dick (1996) CSH Gene Therapy
abstract #n). Retroviruses can infect these stem cells with very low but detectable
efficiency. Progenitor cells with integrated retrovirus can be recovered from peripheral blood
cells in this SCID/NOD repopulation model. This and analogous repopulation systems
therefore form the basis for selecting retroviruses with improved efficiency of integration
into primitive pluripotent cells. As noted above, including GFP in the vector allows for
FACS purification of cells expressing retroviral-encoded proteins after repopulation. If the
repopulation is initially very inefficient, a selectable gene such as Neo or TK is also expressed
to selectively culture transduced cells.

In another embodiment, rather than removing infected stem cells and isolating
retroviral sequences for further rounds of recursive recombination, lethally irradiated
retroviral producing helper lines containing recombinant sequences are injected into the
SCID/NOD bone marrow. With this technique, recursive recombination takes place in vivo,
the stem cells remaining in the special environment of the bone marrow, an environment that
may prove impossible to mimic in vitro.

In a further embodiment, recursive recombination is used to develop a means
by which viruses which normally lack the means to integrate into non-dividing cells can do so.
This method incorporates HIV proteins which are required for HIV to integrate into
nondividing cells into other vectors of interest. For example, integrase, the enzyme
responsible for mediating the integration of the viral genome in the host cell chromosome,
can suffice to connect the HIV-1 preintegration complex with the cell nuclear import
machinery. Viral matrix and Vpr proteins also play important roles in the ability of HIV to
integrate into non-dividing cells. See Gallay (1997) Proc. Natl. Acad. Sci. USA 94:9825-
9830. Repeated cycles of recursive recombination, such as DNA shuffling, are carried out until the
desired property is conferred to the vector or sequence of interest.

In another embodiment, before recursive recombination, long term bone
marrow cultures are stimulated to cycle in vitro. This results in increased retroviral
transduction of the stem cells in both a murine SCID/Beige repopulation assay (Agatsuma
(1997) Antiviral Res. 34:121-130) and in stem cell repopulation of terminal human myeloma
patients with transduced bone marrow cells. Cycling stem cells are more susceptible to
transduction. Thus, stem cells can be stimulated such that they are more susceptible to
retroviral transduction and yet remain pluripotent.

(H) Improved Tissue Specificity of a Vector

Vectors with new and/or improved tissue specificity (tissue tropism) can be 
developed using the recursive recombination methods of the invention. In most gene therapy 
applications, it is desirable that the gene therapy vector be delivered with a high degree of
specificity to a particular tissue type. Specificity of cellular targeting is a key issue impacting 
the safety and practicality of these vectors for in vivo gene therapy. Thus, there is a need to 
restrict and/or redirect the specificity of gene therapy vectors, such as adenovirus. 

One example illustrating the need to deliver a gene therapy vector to a specific
tissue type involves delivering a wildtype CFTR gene to cystic fibrosis patients. The CFTR
gene should be delivered mainly to pulmonary tissue. In a second example, where the gene 
therapy vector encodes a chemotherapeutic agent, it is desirable that the agent be delivered to 
neoplastic cells and not normal cells. 

The strategy in selecting substrates and recombination formats is in general 
similar to those discussed before. Substrates for recombination can be whole viral genomes 
or can be fragments encoding the viral proteins thought to interact with cellular receptors. If 
such fragments are recombined, the recombination products should be reinserted into viral 
genomes, and the genomes packaged to form viruses before screening. 

For example, for evolution of vesicular stomatitis virus (VSV) to infect new
target cells, recursive recombination should focus on G-protein sequences, because the G
protein is expressed on the capsid's outer surface (Schnell (1996) Proc. Natl. Acad. Sci. USA
93:11359-11365). Furthermore, it has been technically difficult to generate viruses encoding
the vesicular stomatitis virus G-protein (VSV-G) because it is too toxic to the host cells to
allow for viral propagation (Yoshida (1997) Biochem. Biophys. Res. Commun. 232:379-382).
Thus, the methods of the invention can be used to generate modified VSV-G protein, thereby
generating new target cells for recombinant VSV.

There is also a need to generate tissue-specific adenoviruses. Since the
tropism of adenovirus is nonselective, tissue-specific expression of systemically administered
vectors can only be achieved by the use of a tissue-specific promoter/enhancer that is small
enough to fit the insert capacity of the vector. Alternatively, tissue-specific expression is
generated by ablating the native promiscuous tropism of adenovirus and constructing new
tissue-specific domains using the methods of the invention. Generation of tissue-specific
adenoviruses by recursive sequence recombination overcomes this non-selective tropism
limitation of native adenovirus in the use of the vector in gene therapy.

Adenovirus binds to eukaryotic cells using a "fiber protein" which protrudes
from each of the twelve vertices of its icosahedral capsid. Serological and mutagenesis
studies make it clear that the fiber, a homotrimer consisting of "staff" and "knob" domains,
interacts with cellular receptors. The structure of the knob has been reported by Xia (1994)
Structure 2:1259-1270. R. D. Gerard has used the structure of the heterotrimeric knob to scan
this structure by SDM for mutations that reduce binding to the receptor (personal
communication, 1996 CSH Gene Therapy meeting). These studies allow construction of
mutants with abrogated or severely reduced ability to infect using the natural receptor, which
is known to be expressed in many tissue types. This is a starting point from which to evolve
(i.e., using the recursive sequence recombination methods of the invention) new tissue
specificities for the adenovirus fibers which bind to cellular receptors. V. Legrand (CSH
poster 184) and Dan von Seggern (CSH poster #223) have reported systems for expressing
mutants of the fiber protein off of a small, easily manipulated SV40-based vector. These
constructs will support plaque formation by an adenovirus deleted for the fiber gene. Legrand
used this system to fuse the 11 amino acid Gastrin Releasing Peptide (GRP) to the C-terminus
of the fiber gene. LacZ+ adenoviral mutants expressing this fusion protein were able to infect
cells expressing GRP receptor in a manner that was only 60% inhibitable by soluble knob
protein (CSH poster 184), whereas viruses expressing the wild type protein are about 90%
inhibitable. This was given as evidence that the interaction of GRP with its receptor is
supporting infection of the host cells.

In one illustrative embodiment, to improve this adenovirus system using the
methods of the invention, a mutant fiber protein or a domain replacing the knob that has lost
the ability to bind its native receptor is generated. Generation of evolved fiber sequences by
recursive recombination generates a new adenovirus fiber or knob-associated ligand with a
new specificity. Alternatively, libraries of mutant sequences can be inserted onto the
C-terminus of the knob in a manner analogous to the GRP construct described above.
Libraries of potential ligands can be randomly inserted throughout the "staff" and/or "knob"
domain. The entire knob can be randomly mutagenized and selected for infection of desired
targets. Other exposed viral proteins, such as penton or hexon proteins, can be modified with
libraries of insertion mutants. Libraries encoding short protein sequences can be inserted
into adenovirus hexon protein and expressed on the surface of the adenovirus virion as part of
the hexon (Crompton (1994) J. Gen. Virol. 75:133-139). Next, these modified viruses
comprising the recursively evolved viral proteins are used to infect target cells. Diversity and
modifications in viral protein affecting adenovirus tropism are selected for by plaque
formation, or by cell sorting (FACS), which can be based on transient expression of a reporter
gene such as GFP.

Interaction of the fiber penton protein with an integrin on the target cell
surface, the alpha-v-integrin, provides a cell-virus stabilizing interaction (it is known that one
cannot totally inhibit adenovirus infection with soluble fiber knob protein). In the absence of
fiber penton-cell integrin interaction, there is a lower level of viral infectivity. As a result of
this complexity in the mechanism which determines the cell specificity of adenovirus, the
methods of the invention are used to coevolve multiple genes or domains on adenovirus
which interact with their cognate receptors on target cells, such as the penton fiber domain
which interacts with target cell alpha-v-integrin. Consequently, recursive sequence
recombination of chosen viral genes, or of the whole virus, is a particularly useful tool with
which to rapidly evolve tissue-specific adenovirus.

In another illustrative embodiment, the highly developed M13 technology is
used to evolve peptide ligands for receptors of interest on target cells. Standard phage display
library technology is used to screen for peptide ligands capable of binding purified receptor.
Alternatively, the libraries can be screened by panning against cells. The affinity of these
ligands is rapidly evolved in M13. Pools of evolved ligands are then engrafted onto target
sites on adenovirus, for example, C-terminal fusions to fiber protein. This couples the power
of M13 selection to the adenovirus system, making it possible to make libraries of the size
that could not be made with M13 alone.

Screening is accomplished by contacting viruses containing recombinant
segments with a first population of cells for which infection by the virus is desired and a
second population of cells for which infection is not desired. Viral genomes recovered from
the first population of cells are enriched for recombinant segments conferring specificity for
that cell type. The first and second populations of cells can be present in
an organism. For example, one can infect a whole organism with the virus and recover
recombinant segments from a subset of blood cells (this being the cell type for which
infection is desired). Alternatively, one can infect a whole organism, including humans,
suffering from a natural or induced cancer with virus and recover recombinant segments from
the cancer cells. In a further variation, the first and second population of cells are
co-cultivated with the virus in mixed cell culture. The two cell types, if they are not readily
distinguishable by microscopic examination, can be distinguished by expression of a marker,
such as green fluorescent protein or cell surface receptor in one cell type. In the initial round
of screening, the existing host cells are usually present in excess (e.g., a ratio of 90% existing
host cells to 10% desired target cells). The proportion of desired target cells can be increased
in successive rounds.
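
By way of illustration only, the enrichment expected from such a two-population screen can
be sketched numerically in Python. The model is an assumption made for this sketch and is
not part of the disclosed methods: each virion is taken to keep encountering cells until it
infects one, and the variant names and infection probabilities are hypothetical.

```python
def prob_ends_in_target(p_target, p_other, target_fraction):
    """Chance that a virion finally infects a target cell rather than a non-target cell.

    Assumed model: every encounter is with a target cell with probability
    `target_fraction`; the virion infects that cell with probability `p_target`
    (target) or `p_other` (non-target), and otherwise moves on to another cell.
    """
    hit_target = target_fraction * p_target
    hit_other = (1.0 - target_fraction) * p_other
    if hit_target + hit_other == 0:
        return 0.0
    return hit_target / (hit_target + hit_other)


def recovered_fractions(library, target_fraction):
    """Share of each variant among genomes recovered from the target population.

    library maps a variant name to (starting frequency, p_target, p_other).
    """
    amounts = {
        name: freq * prob_ends_in_target(p_target, p_other, target_fraction)
        for name, (freq, p_target, p_other) in library.items()
    }
    total = sum(amounts.values())
    if total == 0:
        raise ValueError("no variant infects the target population")
    return {name: amount / total for name, amount in amounts.items()}


# Hypothetical library: a rare target-specific variant and a common promiscuous one,
# screened at the 10% target / 90% existing-host ratio mentioned in the text.
example = recovered_fractions(
    {"specific": (0.01, 0.5, 0.01), "promiscuous": (0.99, 0.5, 0.5)},
    target_fraction=0.10,
)
```

Under these hypothetical numbers the specific variant, starting at 1% of the library, is
enriched severalfold in a single round, because the excess of non-target cells absorbs most
of the promiscuous variant.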

The recombinant segments recovered from the population of cells for which
infection is desired are used as substrates in the next round of recombination. Subsequent
rounds of screening are performed by the same principles.

In a variation of the above approach, a eukaryotic or bacterial virus is modified
to have specificity for a given cell type by expressing a ligand as a fusion protein with a viral
coat protein on the virus's outer surface. The ligand is chosen to have affinity for a receptor
known to be present on the cell type of interest. For example, the EGF family of proteins
encompasses several polypeptides such as epidermal growth factor (EGF), transforming
growth factor alpha (TGF alpha), amphiregulin (AR) and heregulin (HRG-beta 1), which
regulate proliferation in breast cancer via membrane receptors.
Han (1995) Proc. Natl. Acad. Sci. USA 92:9747-9751, reported that Moloney murine
leukemia virus can be modified to express human heregulin fused to gp70, and the
recombinant virus infects certain human breast cancer cells expressing human epidermal
growth factor receptor.

This principle can be extended to other pairs of virus expressing a ligand
fusion protein and target cell expressing a receptor. For example, filamentous phage can be
engineered to display antibody fragments (e.g., Fab or Fv) having specific binding affinity for
virtually any chosen cellular receptor. Binding specificity of ligand to receptor can be
optimized by recursive recombination of the segment of the viral genome encoding the
ligand, and screening using first and second populations of cells as discussed above.

Although viral vectors are most amenable to evolution/recursive 
recombination to acquire new or altered tissue specificity, the same principles can be applied 
to nonviral vectors. Such vectors can be engineered to contain specific uptake sequences 
thought to favor uptake by specific target cells. Alternatively, variants of nonviral vectors can 
be recombined without prior knowledge of sequences that might mediate uptake. For 
example, the starting substrates can be random sequences. Recombination products are 
contacted with first and second populations of cells as described above under similar 
conditions to those contemplated for use of the vector. For example, if a vector is to be used
packaged in liposomes, screening is performed with vectors containing recombinant segments
packaged as liposomes. Again, vectors containing recombinant segments are recovered from
the population of target cells and these segments are used in the next round of recombination.

(I) Improved Uptake of DNA Mediated by Evolved DNA Binding Proteins

The efficiency and specificity of uptake of vector nucleic acid by a
given cell type can be improved by coating the vector with an evolved/recursively
recombined and modified protein that binds to the nucleic acid. The vector can be contacted
with the modified protein in vitro or in vivo. In the latter situation, the protein is expressed in
cells containing the vector, optionally from a coding sequence within the vector. The nucleic
acid binding proteins to be evolved usually have nucleic acid binding activity but do not
necessarily have any known capacity to enhance or alter nucleic acid uptake.

In this embodiment, DNA binding proteins that are modified by the methods
of the invention include transcriptional regulators, enzymes involved in DNA replication
(e.g., recA) and recombination, and proteins that serve structural functions on DNA (e.g.,
histones, protamines). Other DNA binding proteins can include the phage 434 repressor, the
lambda phage cI and cro repressors, the E. coli CAP protein, myc, proteins with leucine
zippers and DNA binding basic domains such as fos and jun; proteins with 'POU' domains
such as the Drosophila paired protein; proteins with domains whose structures depend on
metal ion chelation such as Cys2His2 zinc fingers found in TFIIIA, Zn2(Cys)6 clusters such as
those found in yeast Gal4, the Cys3His box found in retroviral nucleocapsid proteins, and the
Zn2(Cys)8 clusters found in nuclear hormone receptor-type proteins; the phage P22 Arc and
Mnt repressors (see Knight (1989) J. Biol. Chem. 264:3639-3642; Bowie (1989) J. Biol.
Chem. 264:7596-7602). RNA binding proteins are reviewed by Burd (1994) Science
265:615-621, and include HIV Tat and Rev.
As in other embodiments of the invention, evolution of DNA binding proteins
toward acquisition of improved or altered uptake efficiency is effected by recursive cycles of
recombination and screening. The starting substrates can be nucleic acid segments encoding
natural or induced variants of one or more nucleic acid binding proteins, such as those mentioned
above. The nucleic acid segments can be present in vectors or in isolated form for the
recombination step. Recombination can proceed through any of the formats described in
Section II.

For screening purposes, the recombined nucleic acid segments should be
inserted into a vector, if not already present in such a vector during the recombination step.
The vector encodes a selective marker capable of being expressed in the cell type for which
uptake is desired. If the DNA binding protein being evolved recognizes a specific binding
site (e.g., lacI binding protein recognizes lacO), this binding site can be included in the
vector. Optionally, the vector can contain multiple binding sites in tandem.

The vectors containing different recombinant segments are transformed into
host cells, usually E. coli, to allow recombinant proteins to be expressed and bind to the
vector encoding their genetic material. Most cells take up only a single vector and so
transformation results in a population of cells most of which contain a single species of
vector. After an appropriate period to allow for expression and binding, cells are lysed under
mild conditions that do not disrupt binding of vectors to DNA binding proteins. For example,
a lysis buffer of 35 mM HEPES (pH 7.5 with KOH), 0.1 mM EDTA, 100 mM Na glutamate,
5% glycerol, 0.3 mg/ml BSA, 1 mM DTT, and 0.1 mM PMSF, plus lysozyme (0.3 ml at 10
mg/ml), is suitable (see Schatz et al., US 5,338,665). The complexes of vector and nucleic
acid binding protein are then contacted with cells of the type for which improved or altered
uptake is desired under conditions favoring uptake (e.g., for eukaryotic cells, recipient cells
can be treated with calcium phosphate or subjected to electroporation). Suitable recipient
cells include the human cell types that are common targets in gene therapy, discussed
elsewhere in this application.
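
As a convenience, the dilution arithmetic behind the lysis buffer recited above can be
sketched in Python using C1*V1 = C2*V2. The final concentrations are those given in the
text; the stock concentrations are assumptions made for this sketch and are not specified in
this application. Lysozyme is added separately, as stated above.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock to pipette so that final_conc is reached (C1*V1 = C2*V2)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the final buffer")
    return final_conc * final_volume_ml / stock_conc


# component: (final concentration from the text, ASSUMED stock concentration).
# Both numbers in a row use the same unit, named in the key.
RECIPE = {
    "HEPES-KOH pH 7.5 (mM)": (35, 1000),
    "EDTA (mM)": (0.1, 500),
    "Na glutamate (mM)": (100, 1000),
    "glycerol (% v/v)": (5, 50),
    "BSA (mg/ml)": (0.3, 10),
    "DTT (mM)": (1, 1000),
    "PMSF (mM)": (0.1, 100),
}


def pipetting_plan(final_volume_ml):
    """Return the volume (ml) of each stock, plus water to reach the final volume."""
    plan = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in RECIPE.items()
    }
    plan["water"] = final_volume_ml - sum(plan.values())
    return plan


plan = pipetting_plan(10.0)  # volumes for 10 ml of buffer
```

Any other set of stock concentrations can be substituted by editing the second number in each
row of `RECIPE`.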
After incubation, cells are plated with selection for expression of the selective
marker present in the vector containing the recombinant segments. Cells expressing the
marker are recovered. These cells are enriched for recombinant segments encoding nucleic
acid binding proteins that enhance uptake of vectors encoding the respective recombinant
segments. The recombinant segments from cells expressing the marker can then be subjected
to a further round of selection. Usually, the recombinant segments are first recovered from
cells, e.g., by PCR amplification. The recombinant segments can then be recombined with
each other or with other sources of DNA binding protein variants to generate further
recombinant segments. The further recombinant segments are screened in the same manner
as before.

In a variation of the above procedure, a binding site recognized by a DNA
binding protein can be evolved instead of, or as well as, the DNA binding protein. DNA
binding sites are evolved by an analogous procedure to DNA binding proteins except that the
starting substrates contain variant binding sites and recombinant forms of these sites are
screened as a component of a vector that also encodes a DNA binding protein.

Evolved nucleic acid segments encoding DNA binding proteins and/or
evolved DNA binding sites can be included in gene therapy vectors. If the affinity of the
DNA binding protein is specific to a known DNA binding site, it is sufficient to include that
binding site and the sequence encoding the DNA binding protein in the gene therapy vector
together with such other coding and regulatory sequences as are required to effect gene therapy.
In some instances, the evolved DNA binding protein may not have a high degree of sequence
specificity and it may be unknown precisely which sites on the vector used in screening are
bound by the protein. In these circumstances, the gene therapy vector should include all or
most of the screening vector sequences together with additional sequences required to effect
gene therapy.

An exemplary selection scheme is shown in Figure 2. The lower left portion
of the Figure shows two vectors, each having the same marker and DNA binding site, the
vectors differing in the recombinant segment encoding a DNA binding protein. The vectors
are transfected into E. coli cells. The vectors are expressed in the cells to produce DNA
binding proteins, which differ between the different cells. The recombinant binding proteins
complex with the vectors encoding them and these complexes are preserved after cell lysis.
The complexes are then contacted with a recipient eukaryotic cell. The eukaryotic cell bears
several different cell surface receptors, one of which can interact with one of the DNA
binding proteins to facilitate uptake of DNA. Selection for expression of the selection marker
on the vector identifies cells transformed with vector. These cells are enriched for
recombinant segments conferring enhanced DNA uptake.

(J) Improved Intracellular Stability of a Vector

Vectors with greater and improved cell retention, intracellular stability and
expression properties can be developed using the recursive recombination methods of the
invention. In many gene therapy methods, it is desirable that the vector be stably maintained
in target cells and thereby be capable of indefinite expression. This is the case for both viral
and nonviral vectors. Substrates and recombination formats for evolution of vectors toward
improved retention can be chosen according to the principles described above. If the
substrates are fragments of vector genomes, the recombination products are reinserted into
vector genomes before screening. The vector genomes can often contain a selective marker
replacing or fused to the therapeutic coding sequence carried by the vector in actual use. For
screening, vector genomes containing recombinant segments are introduced into cells, if they
are not already so present as a result of in vivo recombination. The cells are grown for a
number of generations without selection for the marker, thereby reflecting the situation in
vivo, in which it is typically not possible to select for retention of a therapeutic gene. After an
appropriate period of growth, selection for the marker is applied and surviving cells
recovered. These cells can contain vectors harboring recombinant segments conferring the
property of improved retention (i.e., recombinant segments stably maintained) in a cell. In
some instances, the properties of improved retention are, at least in part, a consequence of
improved, more stable integration into the cellular genome. Recombinant segments having
the property of improved replication, retention and/or stability are recovered from cells, and
subjected to a further round of recombination, either with each other and/or with fresh
substrates to generate further recombinant segments. These are screened in the same manner
as the previous recombinant segments.
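
The effect of the selection-free growth period described above can be illustrated with a simple
calculation in Python. The model is an assumption made for this sketch, not a statement of
the disclosed method: each variant is taken to lose its vector with a fixed, independent
probability per cell generation, and the loss rates used are hypothetical.

```python
def retained_fraction(loss_per_generation, generations):
    """Fraction of cells still carrying the vector after growth without selection."""
    if not 0.0 <= loss_per_generation < 1.0:
        raise ValueError("loss_per_generation must be in [0, 1)")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - loss_per_generation) ** generations


def enrichment(loss_stable, loss_unstable, generations):
    """How much the stable variant gains over the unstable one once selection is applied."""
    return retained_fraction(loss_stable, generations) / retained_fraction(
        loss_unstable, generations
    )


# Hypothetical variants losing the vector in 1% versus 10% of divisions.
after_10 = enrichment(0.01, 0.10, generations=10)
after_50 = enrichment(0.01, 0.10, generations=50)
```

Under this model a longer period of growth without selection gives a larger enrichment for
stably maintained segments, at the cost of fewer surviving cells overall.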

(K) Reduced Immunogenicity of Vectors

Protein and nucleic acid sequences with reduced immunogenicity can be
developed using the recursive recombination methods of the invention. Immunogenicity is a
particular concern with viral vectors, since a host immune response, including CTL mediated
and humoral responses, can prevent a virus from reaching its intended target, particularly in
repeated administrations. Cellular immune responses preventing a virus from reaching its
intended target can also be induced against nonviral vectors administered in naked form or
shielded with a coat such as liposomes.

Host immune responses which eliminate infected cells are also a major problem
in gene therapy. CTLs are primarily responsible for the elimination of infected cells,
although the problem can also be partly or entirely antibody-mediated. The recursive
recombination methods of the invention can be used to modify a virus to reduce this
(primarily cellular) immunity against virally infected cells. In a variation of this embodiment,
for adenovirus-mediated gene transfer, adenovirus late gene expression is reduced by
mutations induced by the methods of the invention to reduce CTL responses which contribute
to the elimination of virus-infected cells. Thus, the problem of transient retention of virus
which can be seen in adenovirus-mediated gene transfer is alleviated.

Substrates and formats for recombination generally follow the principles
discussed above. In general, regions of the viral genome encoding outer surface proteins
provide the most likely initial substrates for evolution toward reduced immunogenicity.
Alternatively, the whole vector genome can be included as an initial substrate for
recombination. Recombinant viral genomes should be packaged as viruses before screening,
and nonviral genomes should be prepared in the proposed composition for therapeutic
administration (e.g., liposomes).

Viruses containing recombinant genomes or nonviral genomes appropriately
formulated are administered to a mammal, such as a mouse, rat, rabbit, pig, horse, primate or
human, and surviving viruses or nonviral genomes are recovered after an appropriate interval.
Often the administration is i.v. and surviving viruses and nonviral genomes are recovered
from the blood. Surviving viruses and nonviral genomes are enriched for recombinant
segments conferring the property of reduced immunogenicity. These recombinant segments
are used as some or all of the substrates in the next round of recombination. Subsequent
rounds of selection follow the same format.

In a variation of the above format, antibodies are collected from mammals 
immunized with the viral library, and immobilized on a column. Another aliquot of the viral 
library, or a derivative library resulting from a further round of recombination, can then be 
applied to the column and viruses passing through the column collected. These viruses are 
enriched for viruses with low immunogenicity.

In a variation of this method for nonviral vectors, the expression
product of the vector is expressed as a fusion protein joined to a DNA binding protein that
has affinity for a sequence on the vector. In this way, at least some of the expression product
is maintained in physical proximity with the vector producing it. Thus, immune responses
directed against the expression product also remove the vector sequence. Accordingly,
recovery of vector sequences surviving a period of time in an animal enriches both for vector
sequences that themselves have low immunogenicity and for those which encode expression
products with low immunogenicity.

(L) Reduced Toxicity of Vectors

Protein and nucleic acid sequences with reduced cellular toxicity can be
developed using the recursive recombination methods of the invention. Toxicity caused by
viral gene expression is sometimes a concern when using viral vectors in gene therapy. The
methods of the invention can be used to induce and select for multiple combinations of
mutations blocking viral DNA replication and gene expression in vivo. To produce the
crippled viruses in vitro, these mutations should be conditional mutations, such as
temperature sensitive or nonsense mutations, so that the mutant viruses can be propagated in
vitro under permissive conditions. The multiplicity and hence redundancy of the conditional
mutations prevents the mutant virus from reverting back to the wildtype genotype or
phenotype.

(M) Improved Specificity of Integration

Vector sequences with improved specificity of integration can be developed
using the recursive recombination methods of the invention. For example, AAV is known to
integrate preferentially at a site in chromosome 19q13.3. Integration at this site is
advantageous since the presence of an exogenous DNA sequence at this site does not appear
to have any adverse effect on expression of endogenous cellular genes. It is therefore
desirable to be able to increase the specificity of AAV to this site.

The starting substrates for recombination are AAV vectors including at least
ITRs and, optionally, a rep gene, since the latter may have a role in site-specific
recombination. Genes from other viruses known or believed to have a role in site-specific
integration can also be included. Preferably the genomes include a marker sequence.
Recombination proceeds through any of the recombination formats previously discussed to
produce a library of AAV viruses having different recombinant segments in their genomes.
The AAV viruses are used to infect appropriate target cells. Cells having taken up AAV
DNA can be recognized from expression of the marker. Genomic DNA is isolated from these
cells, and a region centered on the intended site of integration is amplified by PCR. The
amplified regions are enriched for recombinant segments conferring the desired property of
site-specific integration. These recombinant segments form the starting materials for the next
round of recombination.

Analogous principles apply to other viral vectors and, indeed, nonviral
sequences and vectors. For example, as discussed above, one embodiment of the invention
utilizes site-specific integration systems to target a recombinant sequence of interest to a
specific, constant location in the genome. A preferred embodiment uses the Cre/LoxP or the
related FLP/FRT site-specific integration system. The Cre/LoxP system uses a Cre
recombinase enzyme to mediate site-specific insertion and excision of viral or phage vectors
into a specific palindromic 34 base pair sequence ("LoxP site"). The recursive sequence
recombination methods of the invention can be used to modify these systems, such as to
improve specificity of integration, create alternate, specific sites of integration, modify
recombinase activity, and the like.

In a further embodiment, it is not necessary that the starting vector have any 
preferred integration site. If this is the case, a suitable chromosomal site unlikely to interfere 
with expression of other genes is chosen, and successive cycles of recombination and 
selection performed until a vector has evolved to integrate preferentially at that site. 

(N) Improved Resistance to Microorganisms

The recursive recombination methods of the invention can also be used to 
develop new or improve upon known inhibitors of microbial and viral infection, including 
trans-dominant inhibitors of microbial and viral replication and gene expression. In some 
gene therapy applications, the vector can encode a product that is an inhibitor to a 
microorganism, such as a virus. Because of the complexity of viral life cycles and the 
intrinsic mutability of viruses, recursive sequence recombination is a practical tool for 
evolving protective antiviral constructs with improved potency and/or new or improved 
specificities. This can be accomplished using any of a variety of mechanisms. For example, the
gene therapy vector can encode an antisense RNA that blocks expression of a viral or other
pathogen's mRNA. The antisense RNA can be designed to bind to a key regulatory sequence,
such as a promoter, or to the coding sequence, or both. Alternatively, the vector can encode a
protein that is inhibitory to the replication or gene expression of a pathogen. For example, a
number of gene therapy strategies have been designed with the intent of inhibiting HIV-1
replication in mature T cells. As T cells are products of hematolymphoid differentiation,
insertion of antiviral genes into hematopoietic stem cells serves as a vehicle to confer
long-term protection in progeny T cells derived from transduced stem cells. One such
"cellular immunization" strategy utilizes the gene coding for the HIV-1 rev trans-dominant
mutant protein RevM10, which has been demonstrated to inhibit HIV-1 replication in T-cell
lines and in primary T cells, as described in Bonyhadi (1997) J. Virol. 71:4707-4716; Nabel
(1996) Gene Therapy, abstract 361, CSH. HIV-1 tat and rev mutants have also been
suggested as potential intracellular, trans-dominant inhibitors of HIV-1 replication, Caputo
(1997) Gene Ther. 4:288-295. Another candidate for development by the methods of the
invention is the trans-acting transcriptional regulatory protein I kappa B alpha, which can act
as a cellular inhibitor of human retroviral replication through a mechanism independent of its
effect on HIV transcription, see Wu (1995) Proc. Natl. Acad. Sci. USA 92:1480-1484.
Repeats of inhibitors derived from viral fragments, such as poly-TAR constructs, can also be
used as inhibitors of HIV-1 gene expression. TAR is an RNA stem-loop structure bound by
activators or inhibitors of HIV-1 gene expression. TAR can be used to mediate (for example,
saturate) cellular factor/RNA interactions, and it has been suggested that the action of
transcriptional activators (such as Tat) might be inhibited by such competing TAR reactions in vivo;
see Baker (1994) Nucleic Acids Res :Il:2i65-23li. The recursive recombination methods of
the invention can develop and improve upon these and related intracellular inhibitory systems.
There are also many examples where a protein from one virus or viral ...
can be inhibitory to the development of another. Woffendin (1994) Proc. Natl. Acad. Sci.
USA 91:11581-11585. In particular, at least one protein from adeno-associated virus (AAV)
is known to be inhibitory to HIV. The large rep gene products, Rep78 and Rep68, of AAV
are pleiotropic effector proteins which are required for AAV DNA replication and the
trans-regulation of AAV gene expression. Apart from these essential functions, these rep
products are able to inhibit the replication and gene expression of HIV-1 and a number of
DNA viruses. Batchu (1995) FEBS Lett. 367:267-271; Antoni (1991) J. Virol. 65:396-404.
The recursive recombination methods of the invention can develop new and improve upon
these inter-viral inhibitory proteins.

The present invention provides a means for improving the inhibitory qualities
of the anti-sense RNAs and proteins described above and also for identifying new inhibitory
agents. The improvement to known inhibitory reagents can reside in several aspects, such as
improved expression, improved stability or altered fine-binding specificity. It is not
necessary in the present methods to know which of these contributory properties is being
improved; rather the selection is for the ultimately desired property of microorganism
resistance.

For evolution of known inhibitory agents, substrates for recombination and
recombination formats are selected according to the principles discussed above. The
substrates can be viral vector genomes or the parts thereof encoding the inhibitory agents and
associated regulatory sequences. Initial diversity in recombination substrates can be natural
or induced. After a round of recombination, the recombinant segments are introduced into
cells (if they are not already in cells as a result of in vivo recombination) and the cells are
contacted with the microorganism for which protection is desired. Cells surviving exposure
to the microorganism are enriched for recombinant segments conferring resistance to the
microorganism. These recombinant segments form some or all of the substrates for the next
round of recombination.

Similar principles can be applied for de novo identification of inhibitory agents 
to be expressed from gene therapy vectors. More rounds of recombination and screening can 
be required to obtain satisfactory results. For example, sequences coding for viral proteins 
from the virus to be inhibited or other viruses provide suitable initial substrates for
recombination. The coding sequences can be obtained from the same or different viruses and
natural diversity can be augmented by inducing additional mutations, e.g., by error-prone
PCR, as described above. Recombination and screening are also performed as described 
above. 

In an illustrative embodiment, a library of mutants is constructed based on 
candidate construct(s), examples of which are described above. The libraries are transduced 
or transfected into target cells. The cells are challenged with the microorganism of interest. 
Resistant cells are isolated based on, for example, survival against cytopathic virus or lack of 
expression of viral encoded genes, which can include inserted marker genes such as GFP. 
These methods are used to detect cells in which viral replication or gene expression has been 
blocked. FACS or panning with an antibody against a virally encoded or induced surface
epitope is used in a positive selective step. Genes encoding resistance factors are recovered,
for example, by PCR. The recovered genes can be subjected to further rounds of recursive
sequence recombination, as described above, until a desired level of protection against the
microorganism is achieved.
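
The cycle described above (construct a library, challenge, recover survivors, recombine, and
repeat until the desired level of protection is reached) can be sketched as a toy loop in
Python. Everything in the sketch is a hypothetical stand-in: sequences are short strings,
`score` substitutes for the biological challenge and selection, and the per-position
crossover is only a crude model of recombination between related segments.

```python
import random


def recombine(parent_a, parent_b, rng):
    """Build a child that takes each position from one of two equal-length parents."""
    return "".join(rng.choice(pair) for pair in zip(parent_a, parent_b))


def evolve_until(start, score, target_score, library_size, keep, max_rounds, seed=0):
    """Repeat recombination and selection until the best survivor reaches target_score.

    Returns (survivors, rounds_used). `keep` survivors seed each new library.
    """
    if len(start) < 2 or keep < 2 or library_size < keep:
        raise ValueError("need at least two substrates and library_size >= keep >= 2")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    survivors = list(start)
    for round_number in range(1, max_rounds + 1):
        # Recombination step: pair up randomly chosen survivors.
        library = [
            recombine(*rng.sample(survivors, 2), rng) for _ in range(library_size)
        ]
        # Selection step: keep the best-scoring members of the library.
        survivors = sorted(library, key=score, reverse=True)[:keep]
        if score(survivors[0]) >= target_score:
            return survivors, round_number
    return survivors, max_rounds


# Hypothetical scoring function: the number of "A" characters in the string.
survivors, rounds_used = evolve_until(
    ["AAAATTTT", "TTTTAAAA"],
    score=lambda seq: seq.count("A"),
    target_score=8,
    library_size=50,
    keep=5,
    max_rounds=20,
)
```

The stopping test plays the role of the "desired level of protection"; in practice that level
would be set by the challenge assay rather than by a score computed from the sequence.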

Further illustrative examples of anti-viral mechanisms which can be improved
by the methods of the invention include anti-viral ribozyme systems. For example, one or
more ribozymes can be targeted against a viral RNA. Adenoviruses have been used to deliver
anti-hepatitis C ribozymes; see Lieber (1996) J. Virol. 70:8782-8791; Ohkawa (1997) J.
Hepatol. 27:78-84. HIV-1 Rev response element (RRE) region-specific hammerhead
ribozymes will completely inhibit HIV-1 replication, see Duan (1997) Gene Ther. 4:533-543.
Sendai virus polycistronic P/C mRNA can also be cleaved by ribozymes; Gavin (1997) J.
Biol. Chem. 272:1461-1472.

Anti-viral cytokines can also be improved by the methods of the invention.
For example, wild type or chimeras of wild type interferons such as the IFN alpha 17, IFN
beta and IFN gamma constructs can be subjected to recursive sequence recombination. These
sequences can be placed under the control of a virus-activated promoter, such as an HIV
mini-LTR; see Mehtali (1996) Gene Therapy, abstract #364, CSH. For example, cell lines
stably carrying IFN transgenes under the positive control of the HIV-1 Tat protein are highly
resistant to HIV-1 replication in vitro. This antiviral resistance is associated with a strong
induction of IFN synthesis immediately following the viral infection. However,
IFN-gamma-transfected cells permitted HIV-1 infection in vivo despite the induction of a
high level of IFN-gamma secretion, see Sanhadji (1997) AIDS 11:977-986. The methods of
the invention can be used to develop this anti-viral system for potency and effectiveness in
vivo.

The methods of the invention can be used to develop single chain or Fab 
antibody fragments directed intracellularly to viral components; Mar[...] 
Therapy, abstract 160, CSH. For example, one strategy for somatic gene therapy to treat 
HIV-1 infection is by intracellular expression of an anti-HIV-1 Rev single chain variable 
fragment (Sfv); Duan (1997) Gene Ther., supra. Intracellular expression of Sfvs which bind 
to HIV integrase catalytic and carboxy-terminal domains results in resistance to productive 
HIV-1 infection. This inhibition of HIV-1 replication is observed with Sfvs localized in either 
the cytoplasmic or nuclear compartment of the cell. See Levy-Mintz (1996) J. Virol. 
70:8821-8832. The expression of anti-reverse transcriptase (RT) Sfv in T-lymphocytic cells 
specifically neutralizes the RT activity in the preintegration stage and affects the reverse 
transcription process, an early event of the HIV-1 life cycle. Blocking the virus at 
these early stages dramatically decreased HIV-1 propagation, as well as the HIV-1-induced 
cytopathic effects in susceptible human T lymphocytes, by impeding the formation of the 
proviral DNA. See Shaheen (1996) J. Virol. 70:3392-3400. The methods of the invention 
can further develop the potency and range of such anti-viral, intracellular antibody fragments. 

Improved virus-binding aptamers or peptide ligands, directed to viral 
components, such as those described above, can also be further developed by the methods of the 
invention. For example, RNA aptamers that recognize a peptide fragment of human HIV-1 
Rev were found to bind the free peptide more tightly than a natural RNA ligand, the 
Rev-binding element; see Xu (1996) Proc. Natl. Acad. Sci. USA 93:7475-7480; Symensma 
(1996) J. Virol. 70:179-187. Aptamer sequences isolated from single-stranded DNA 
preparations have thrombin inhibitory activity, indicating that thrombin-inhibitory aptamers 
are present in the mammalian genome and may constitute an endogenous antithrombin 
system. Analogously, the recursive sequence methods of the invention can be used to further 
identify, develop and improve aptamer sequences useful as anti-microbial agents, or for gene 
therapy in general. 

(0) Viral Packaging Cell Lines 

The recursive sequence recombination methods of the invention can also be 
used to develop new and improved viral packaging cell lines. Viral vectors used in gene 
therapy are usually packaged into viral particles by a packaging cell line. The vectors 
typically contain the minimal viral sequences required for packaging and subsequent 
integration into a host, other viral sequences being replaced by an expression cassette for the 
protein to be expressed. The missing viral functions are supplied in trans by the packaging 
cell line. For example, AAV vectors used in gene therapy typically only possess ITR 
sequences from the AAV genome which are required for packaging and integration into the 
host genome. Viral DNA is packaged in a cell line, which contains a helper plasmid 
encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line 
is also infected with adenovirus as a helper. The helper virus promotes replication of the 
AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is 
not packaged in significant amounts due to a lack of ITR sequences. Contamination with 
adenovirus can be reduced by, e.g., heat treatment, to which adenovirus is more sensitive than 
AAV. AAV recombinants are generally produced by transient co-transfection methods since 
it has proven difficult to generate stable packaging [...] Methods 63:129-136). 

[...] packaging cell lines; increasing the yield of AAV vector packaged; decreasing the ratio of 
AAV progeny to helper virus; and reducing the toxicity of the rep gene to the packaging cell, 
which in turn leads to a greater yield of AAV. The leading candidate genes for evolution/ 
modification by the methods of the invention are the AAV replication (rep) and capsid (cap) 
genes, which can be present on the AAV helper plasmid. Overexpression of the rep gene can 
decrease AAV DNA replication and severely inhibit cap gene expression, whereas a reduced 
rep level enhances cap gene expression and supports normal rAAV DNA replication. Thus, 
recursive recombination modification of rep genes and their expression can generate 
increased AAV vector production; see Li (1997) J. Virol. 71:5236-5243. 

These and related sequences can be subjected to recursive sequence 
recombination according to the general principles discussed; that is, variant forms of these 
genes are recombined, either in vivo or in vitro, and cells containing recombinant segments 
resulting from recombination are screened for a desired property, such as stable packaging 
cell lines; yield of packaged AAV; increased viability of cells; or low yield of helper virus 
relative to packaged AAV. The same principles can be applied to evolve genes in the helper 
adenovirus, either concurrently or consecutively with the evolution of AAV genes on the 
helper plasmid. 

Cellular genes in the packaging cell line affecting packaging can also be 
evolved even without knowing what these genes are. This is achieved by transforming the 
packaging cell line with a library of genes, some of which will undergo recombination with 
cognate genes in the packaging cell line. The library of genes can be obtained from another 
type or species of cell or can be a mixture of several types and species and/or can have 
diversity induced by processes such as error-prone PCR. Cells containing recombinant genes 
are screened for improved packaging properties, such as increased yield of AAV virus. 
Optionally, a further library can be transformed into the cells surviving screening in a 
previous round. Alternatively, the pool of surviving cells can be divided in two, and DNA 
isolated from one half and used to transform the other half. In this way, the best recombinant 
segments identified in the first round of screening undergo recombination with each other in 
the second round of recombination. 
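
The screen-and-recombine cycle described above, in which surviving cells are divided in two and DNA from one half is used to transform the other half, can be illustrated with a toy in silico sketch. This is only an analogy under stated assumptions: sequences are plain strings, recombination is modeled as a single crossover, and `toy_score` is a hypothetical stand-in for a real packaging-yield screen. None of these names come from the specification.

```python
import random


def make_random_pool(size, length, seed):
    """Build a hypothetical starting library of random DNA strings."""
    rng = random.Random(seed)
    return ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(size)]


def toy_score(seq):
    """Hypothetical screen: stands in for a measured property such as AAV yield."""
    return seq.count("A")


def recombine(donor, recipient, rng):
    """Model recombination as one crossover between two equal-length sequences."""
    point = rng.randrange(1, len(donor))
    return donor[:point] + recipient[point:]


def screen(pool, score, keep):
    """Keep the `keep` best-scoring variants."""
    return sorted(pool, key=score, reverse=True)[:keep]


def evolve(pool, score, rounds=3, keep=10, seed=0):
    """Repeat screening, then recombine one half of the survivors with the other."""
    rng = random.Random(seed)
    size = len(pool)
    for _ in range(rounds):
        survivors = screen(pool, score, keep)
        rng.shuffle(survivors)
        half = len(survivors) // 2
        donors, recipients = survivors[:half], survivors[half:]
        offspring = [
            recombine(rng.choice(donors), rng.choice(recipients), rng)
            for _ in range(size)
        ]
        # Survivors are carried forward, so the best score never decreases.
        pool = survivors + offspring
    return screen(pool, score, keep)


start = make_random_pool(size=40, length=30, seed=1)
final = evolve(start, toy_score)
print(max(map(toy_score, start)), max(map(toy_score, final)))
```

Carrying the survivors into the next pool is a design choice of the sketch; it guarantees only that the best score is never lost between rounds, not that it improves.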

EXAMPLES 

Example 1: M13 scFv Library 

This example shows in vivo panning of libraries of bacteriophage displaying 
scFv for localization to a predetermined cell type, such as a xenogeneic neoplasm. A scFv 
antibody-phage display library was constructed as described in Crameri (1996) Nature 
Medicine 2:100-102. After growth of the phage library on E. coli TG1 in LB containing 50 
µg/ml kanamycin, bacterial cells were removed by centrifugation and the phage precipitated 
by addition of PEG to 4% and NaCl to 0.5 M final concentration. After one hour incubation 
on ice, the solution was centrifuged at 8,000 x g for 30 minutes, and the pellet resuspended in 
Dulbecco's phosphate-buffered saline (DPBS). 

Male Sprague-Dawley rats were anesthetized and phage were injected 
intravenously and blood sampled arterially via ipsilateral femoral arterial catheters. EDTA 
was used in blood samples to reduce coagulation. Blood samples were taken immediately 
before administration of phage and at 5, 30, 60, 120, and 240 minutes post-injection of 7.6 x 
10^[...] colony forming units. Phage titers were determined by dilution of whole blood in DPBS 
and infection of E. coli TG1 to assay colony forming units of M13. Four repetitions of the 
protocol were performed. It was found that M13 bacteriophage remained stable and 
infectious (to E. coli) with a half-life of six hours in rat blood after in vivo injection. 
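
As a rough illustration of what a six-hour half-life implies at the sampling times listed above, the following sketch computes the fraction of infectious phage expected to remain at each time point. It assumes simple first-order (exponential) decay, which the text does not state, and it reports fractions only because the injected dose is not legible in this copy.

```python
HALF_LIFE_MIN = 6 * 60                      # six-hour half-life from the text
SAMPLE_TIMES_MIN = [5, 30, 60, 120, 240]    # post-injection sampling times


def fraction_remaining(t_min, half_life_min=HALF_LIFE_MIN):
    """Fraction of the initial titer left after t_min, assuming first-order decay."""
    return 0.5 ** (t_min / half_life_min)


for t in SAMPLE_TIMES_MIN:
    print(f"{t:>3} min: {fraction_remaining(t):.3f} of the injected titer")
```

Under this assumption roughly 63% of the injected titer would still be infectious at the final 240-minute sample.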

Example 2: Panning of M13 scFv Library for Specific Localization 

A scFv antibody-phage display library is administered to mice having 
transplantable human tumor grafts. After a suitable incubation time, tumor tissue is harvested 
and phage are eluted from the harvested tissue by homogenization of the tissue sample. 

An aliquot of the recovered phage is subjected to at least one additional cycle 
of administration and selection in vivo by the same protocol. 

An aliquot of the recovered phage is used to purify DNA and the recovered 
DNA is recursively recombined by shuffling in vitro, and the resultant population of shuffled 
M13 genomes is introduced into E. coli and packaged; a library of shuffled M13 species is 
recovered and administered to mice for at least one additional cycle of administration and 
selection in vivo by the same protocol. 

An aliquot of the recovered phage is used to infect E. coli at a high multiplicity 
of infection to recursively recombine M13 genomes in vivo by shuffling, and the resultant 
population of shuffled M13 genomes is introduced into E. coli and packaged; a library of 
shuffled M13 species is recovered and administered to mice for at least one additional cycle 
of administration and selection in vivo by the same protocol. 

Example 3: Evolution of the MGMT Gene 

This example illustrates evolution of the MGMT gene to confer improved 
properties for protection of human bone marrow against alkylating agents. The wild-type 
human MGMT cDNA on a high copy number plasmid was amplified by PCR and randomly 
fragmented with DNase. Small fragments (50-100 bp) were reassembled into full-length 
fragments by Taq DNA polymerase without outside primers in a process that induces point 
mutations at a rate proportional to the size of the starting fragments; see Stemmer (1994) 
Proc. Natl. Acad. Sci. USA 91:10747-10751. Shuffling the entire gene, which encodes 207 
amino acids, allows mutagenesis of all regions of the protein including the functionally 
important DNA-binding region (Kanugula (1995) Biochemistry 34:7113-7119). Full-length 
fragments were cloned back into the vector and transformed into alkyltransferase-deficient E. 
coli (strain GWR111, ada ogt) (Rebeck (1991) J. Bacteriol. 173:2068-2076). Relatively 
large numbers of mutations were created to increase diversity and because inactive variants 
can be eliminated with stringent genetic selection by alkylating agents. This selection 
involves treating the bacteria with the methylating agent MNNG three sequential times, each 
separated by a one-hour recovery period during which the bacteria are allowed to make more 
MGMT. The triple selection kills cells having inactive MGMT and preferentially selects for 
proteins having improved expression and/or activity of MGMT. 

An improved human MGMT gene was also generated using both natural and 
unnatural encoding sequence diversity. Unnatural diversity was created by the random 
fragmentation of the human MGMT (wild-type MGMT cDNA was generously provided by 
Dr. S. Mitra, University of Texas, Galveston; see Tano (1990) "Isolation and structural 
characterization of a cDNA clone encoding the human DNA repair protein for 
O6-alkylguanine," Proc. Natl. Acad. Sci. USA 87:686-690, for cDNA and protein sequences 
and for residue numbering). This was followed by the reassembly of fragments in a 
mutagenic DNA shuffling reaction. Active variants, selected for their ability to confer 
MNNG resistance to alkyltransferase-deficient E. coli, were pooled, remutated, and 
recombined in subsequent cycles of shuffling (the alkyltransferase-deficient (ada ogt) E. coli 
strain GWR111 was provided by L. Samson, Harvard University, Cambridge, MA; Rebeck 
(1991) J. Bacteriol. 173:2068-2076). Two cycles of conventional DNA shuffling were used 
to build up the unnatural diversity. 

The wild-type human alkyltransferase (MGMT) cDNA was subcloned into 
pUC118 plasmid (New England Biolabs, Beverly, MA) and a translationally silent XhoI site 
created at coding nucleotide residue number 380 (Tano (1990) supra, for residue numbering). 
The flanking non-coding sequences were removed from that construct and an E. coli 
ribosome-binding site added via PCR amplification with oligos 1 and 2 (see below) and 
inserted into the EcoRI-HindIII sites of pUC118. 

Oligo #1: 5'-GCATCCGAATTCCTTAAGGAGGGGAAAAATGGACAAGGATTG-3' 
Oligo #2: 5'-CCGCTAAAGCTTCATACTCAGTTTCGGCCAG-3' 

This construct is designated "pFC14." The sequence of the entire MGMT gene in pFC14 was 
verified, as was its ability to complement GWR111. A non-functional dummy vector was 
constructed by replacing the active site-encoding region between the XhoI and PinAI sites 
(nucleotide residue numbers 380 to 521 (Tano (1990) supra, for residue numbering)) with a 
synthetic stuffer duplex made by annealing oligos 3 and 4 (below). 

Oligo #3: 5'-TCGAGCCCCAGGCCTCCGCA-3' 
Oligo #4: 5'-CCGGTGCGGAGGCCTGGGGC-3' 

The inactivity of this gene was verified by its inability to complement GWR111. The dummy 
vector, with the shortened MGMT removed, was used as a cloning vector for library 
construction to reduce the possibility of contamination by wild-type MGMT. 
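
The stuffer duplex formed by oligos 3 and 4 can be checked computationally. The sketch below anneals the two sequences as printed and reports the paired core and the two 5' overhangs. The function names are illustrative only. Reading the overhangs as restriction-site-compatible ends relies on standard recognition sites (C^TCGAG for XhoI, A^CCGGT for AgeI-type enzymes) and is an assumption not stated in the text.

```python
OLIGO_3 = "TCGAGCCCCAGGCCTCCGCA"   # as printed, 5' to 3'
OLIGO_4 = "CCGGTGCGGAGGCCTGGGGC"   # as printed, 5' to 3'

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_duplex(top, bottom):
    """Return (top 5' overhang, paired core, bottom 5' overhang), or None.

    The longest 3' portion of `top` that matches the start of the reverse
    complement of `bottom` is taken as the base-paired core.
    """
    bottom_rc = reverse_complement(bottom)
    for k in range(len(top)):
        core = top[k:]
        if bottom_rc.startswith(core):
            unpaired = bottom_rc[len(core):]
            return top[:k], core, reverse_complement(unpaired)
    return None


top_overhang, core, bottom_overhang = annealed_duplex(OLIGO_3, OLIGO_4)
print(top_overhang, core, bottom_overhang)
```

The two oligos pair over a 16-nucleotide core and leave a 5'-TCGA overhang on one end and a 5'-CCGG overhang on the other, which is what a fragment designed to drop into an XhoI site and an A^CCGGT-type site would require.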

The general procedure for creating randomized gene libraries by random 
fragmentation and reassembly was used as described in Stemmer (1994) Proc. Natl. Acad. 
Sci. USA 91:10747-10751; and Stemmer (1994) Nature 370:389-391. The starting material 
was a 1.2 kbp (kilobase pair) PCR product made from pFC14, generated using the outside 
primers oligo #5: 5'-AAGAGCGCCCAATACGCAAA-3', and oligo #6: 5'- 
TAGCGGTCACGCTGCGGGTAA-3', and Taq DNA polymerase (Promega). This product 
contained the human MGMT plus pFC14 flanking sequence from which 50-300 bp random 
fragments were prepared and reassembled with Taq DNA polymerase, as in Stemmer (1994) 
Proc. Natl. Acad. Sci. USA, supra, and Stemmer (1994) Nature, supra. Reamplification with 
the nested primers oligo #7: 5'-ATGCAGCTGGGACGACAGGTT[...] 
TACAGGGGGCGTACTATGGTT-3', gave a 980-bp fragment which was treated with EcoRI 
and HindIII. The resulting 650-bp fragment was ligated into the dummy vector described 
above. The ligation mixture was electroporated into GWR111, yielding libraries of ~10^[...] per 
cycle from which active clones were selected. Selection was done as described in Christians 
(1996) Proc. Natl. Acad. Sci. USA 93:6124-6128, with the exception of omission of the 
inducer isopropyl-beta-thiogalactopyranoside. Bacteria in culture were treated with 3 
sequential doses of MNNG, each separated by a 1-hour recovery period. After the third dose 
all cells were spread on plates. The next day colonies were pooled, and the MGMT DNA for 
the next cycle was prepared by PCR with oligos #5 and #6 (above). This procedure was 
repeated for a total of 6 cycles. The MNNG treatment was made progressively more stringent 
as the shuffling progressed, starting at 3 x 10 µg/ml MNNG up to as much as 50 µg/ml in later 
cycles. Likewise, fewer colonies were picked for shuffling in later cycles. 
The natural diversity of four known mammalian alkyltransferases (rat, mouse, 
hamster, and rabbit) was also used to generate sequence diversity in the improved human 
MGMT gene. An alignment of their protein sequences, as shown in Figure 4, reveals regions 
of extensive homology as well as regions of diversity. There exist 2 x 10^28 combinations of 
known natural amino acid substitutions from mammalian alkyltransferases (52 positions with 
2 amino acids represented; 24 positions with 3 amino acids; and 2 positions with 4 amino 
acids = 2^52 x 3^24 x 4^2). This diversity was exploited through the use of 2[...] degenerate 
oligonucleotides (Figure 3). These oligos were mixed together in equal proportions to create 
one diverse pool, which was mixed with the DNA fragments during [...] 
in the third and fourth cycles. Several different molar ratios of oligos:fragments were made, 
and it was observed that high concentrations of oligonucleotides inhibited reassembly, 
probably because the large number of base pair mismatches overwhelmed the polymerase. Of 
those mixtures giving proper reassembly, as judged by correct product size after 
reamplification, the one containing the highest proportion of oligos, a molar ratio of 1 oligo:4 
fragments, was chosen for further cycling. Annealing of each oligonucleotide to the 
human-derived MGMT sequence was enabled by 20 nucleotides of homology on both sides flanking 
the degenerate or non-human sequence. Control PCRs demonstrated that all oligonucleotides 
were approximately equally capable of hybridizing to the human sequence. 
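
The size of the natural sequence space quoted above follows directly from the position counts given in the text (52 positions with 2 residues, 24 with 3, and 2 with 4). A short check, using exact integer arithmetic, is sketched below; the variable names are illustrative only.

```python
from math import prod

# residues observed at a position -> number of such positions (from the text)
positions = {2: 52, 3: 24, 4: 2}

variable_positions = sum(positions.values())
combinations = prod(n ** count for n, count in positions.items())

print(variable_positions)        # number of variable positions
print(f"{combinations:.1e}")     # order of magnitude of the sequence space
```

The product comes to about 2 x 10^28, spread over 78 variable positions, which is far more than any single library can sample and is the reason the oligonucleotides are spiked into successive shuffling cycles rather than enumerated.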

In the third round of shuffling, the oligonucleotides were combined with the 
sequences having "unnatural diversity," that is, the pooled 
human MGMT clones that survived cycle 2. Conditions were varied in an attempt to 
incorporate the oligonucleotides and maximize diversity while maintaining the correct size of 
the assembled product. The largest molar ratio of oligonucleotide:fragment to allow correct 
assembly was 1:4. Because of the limitation in the ratio, the "oligo spiking" was repeated in 
cycle 4. The pools in cycles 3 and 4 were thus hybrids containing randomly mutated 
human-derived sequence as well as different combinations of mammalian MGMT gene segments. 
These pools were subjected to selection between cycles. Two final rounds, cycles 5 and 6, of 
"conventional shuffling," without addition of oligonucleotides, were performed in an attempt 
to further evolve the hybrid proteins. 

Individual clones surviving later cycles were screened for improvement by 
treating them with a single 40 µg/ml dose of MNNG and comparing survival to untreated 
samples. The best performing clone, from cycle 4, showed a 10-fold improvement over the 
wild-type at this dose. Its deduced protein (amino acid) sequence, shown in Figure 5 (SEQ 
ID NO:2), based on the improved (evolved) nucleotide sequence (SEQ ID NO:1), contains 7 
amino acid differences from the wild-type human alkyltransferase (see the seven circled 
amino acid residues in Figure 5), 5 of which are found in other mammalian alkyltransferases 
(boxed residues in Figure 4). These 5 amino acid changes presumably were encoded by the 
oligonucleotides spiked in during cycles 3 and 4. All 5 were encoded by the same degenerate 
oligonucleotide pool, #7 in Figure 3. The other amino acid changes, Q (gln) to R (arg) at 
residue number 72 (Q72R) and G (gly) to D (asp) at residue number 173 (G173D) (Tano 
(1990) supra), were not present in the natural diversity and thus were created by the 
mutagenic shuffling process. In addition, 2 translationally silent nucleotide changes (from the 
wild type) were detected (see the two underlined nucleic acid residues in Figure 5). 

This shuffled mutant was characterized more thoroughly for its activity in E. 
coli. In one set of experiments, cells were treated with graded doses of MNNG and the 
surviving fraction determined. Plasmids isolated from individual clones surviving the 
MNNG treatments were retransformed. The retransformed clones were 
screened individually by treating them with a single 40 µg/ml dose of MNNG. The best 
performing clone was further analyzed three ways: (i) The entire MGMT DNA sequence was 
obtained by sequencing the DNA target bidirectionally using fluorescent dye terminator cycle 
sequencing methods (Applied Biosystems 373A Autosequencer, Foster City, CA); (ii) Kill 
curves were established by treating exponentially growing cells with graded single doses of 
MNNG in the absence of isopropyl-beta-thiogalactopyranoside and measuring 
colony-forming ability relative to untreated controls. Cells harboring the wild-type gene (pFC14) or 
the V139F mutant (Christians (1996) Proc. Natl. Acad. Sci. USA, supra) were treated in 
parallel for comparison; (iii) The alkyltransferase activity of bacterial extracts was 
quantitated by in vitro exposure to calf thymus DNA containing O6-[3H]methylguanine as 
described in Bobola (1995) Molec. Carcinogen 13:70-80. Some extracts were preincubated 
with the mammalian alkyltransferase inhibitor O6-benzylguanine. 

Survival of cells harboring the shuffled mutant was greater than for cells harboring 
either the wild-type human MGMT or the V139F mutant. The doses of MNNG giving 10% 
survival were: wild-type, 17.5 µg/ml; V139F, 25 µg/ml; and cycle 4 shuffled mutant, 33 µg/ml. 
In a second set of experiments, bacterial extracts were exposed in vitro to an excess of 
[3H]-methylated DNA substrate, primarily in the form of O6-methylguanine, to measure total 
alkyltransferase activity. Average insoluble counts per minute per µg of total protein were: 
wild-type, 126; V139F, 58; and cycle 4 shuffled mutant, 52. All three proteins were sensitive 
to the inhibitor O6-benzylguanine. 
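
To make the comparison among the three proteins easier to read, the following sketch expresses the reported values relative to wild-type. The numbers are those given in the text, read here as 33 µg/ml and 126 counts where the scanned copy is broken. The dictionary keys and function name are illustrative only.

```python
# MNNG dose (µg/ml) giving 10% survival, as reported in the text
dose_10pct_survival = {"wild-type": 17.5, "V139F": 25.0, "cycle 4 shuffled": 33.0}

# Average insoluble counts per minute per µg total protein, as reported
activity_cpm_per_ug = {"wild-type": 126, "V139F": 58, "cycle 4 shuffled": 52}


def relative_to_wild_type(values):
    """Express each value as a multiple of the wild-type value."""
    reference = values["wild-type"]
    return {name: value / reference for name, value in values.items()}


for name, ratio in relative_to_wild_type(dose_10pct_survival).items():
    print(f"{name}: {ratio:.2f}x wild-type dose for 10% survival")
for name, ratio in relative_to_wild_type(activity_cpm_per_ug).items():
    print(f"{name}: {ratio:.2f}x wild-type extract activity")
```

On these figures the shuffled mutant tolerates roughly 1.9 times the wild-type MNNG dose at 10% survival, while its measured extract activity per µg of protein is about 0.4 times that of wild-type.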

Thus, the recursive sequence recombination methods of the invention have 
successfully generated a new and improved human alkyltransferase protein. The random 
diversity created by the mutagenic shuffling process was augmented by the diversity provided 
by nature. Natural diversity was utilized by simply mixing fragments of the human gene with 
oligonucleotides encoding all of the known mammalian amino acid substitutions. Homology 
to the human gene in the sequence flanking the regions of diversity facilitated incorporation 
of the oligonucleotides. The best performing mutant was a hybrid with 7 amino acid 
differences from the human alkyltransferase, as shown in Figure 5 (SEQ ID NO:1). Two of 
the mutations arose spontaneously during shuffling, and the other 5 were encoded by the 
natural diversity, specifically, one of the "spiked oligos" spanning amino acid position 50. 
Because all oligos were shown by PCR to be capable of hybridizing to the human sequence, it 
is likely that all were incorporated into the pool at least to some degree. 

Previous work with a different system also confirmed that synthetic oligos in 
such a reaction are incorporated at approximately the expected ratios (Crameri (1996) Nature 
Medicine, supra). Another way to incorporate natural diversity is to isolate or synthesize the 
cDNA from each of the species and shuffle the entire coding sequences together. This 
recursive method of breeding natural diversity will improve many related genes from 
different organisms as well as gene families within an organism. Furthermore, it can be 
applied to multiple proteins with related motifs, either structural or functional. 

It is difficult to mechanistically rationalize how the amino acid substitutions in 
the shuffled mutant increase its activity in E. coli. None of the amino acid positions mutated 
in the shuffled mutant was assigned a function in a computer model of the human 
alkyltransferase based upon the sole alkyltransferase crystal structure, that of the bacterial 
Ada protein C-terminal fragment (Wibley (1995) Cancer Drug Design 10:75-95; Moore 
(1994) EMBO J. 13:1495-1501). The clustering of 5 of the mutations around position 50 is 
striking, but no known function has been ascribed to this region of the protein. Three of these 
5 substitutions are found in all of the other mammalian alkyltransferases. While some 
substitutions might be neutral, a possibility that can be tested by backcrossing, others 
might be synergistic, especially those involving charge changes. The proximity of the G (gly) 
to D (asp) mutation at position number 173 (G173D) (see Tano (1990) supra, for residue 
numbering) to the conserved E (glu) at residue number 172 (E172) might be significant given 
the proposed involvement in crucial salt-link interactions by E172. An additional acidic 
residue in the region might enhance this effect. 

The power of DNA shuffling is that it is a molecular breeding process that 
allows for the combination of mutations which incrementally improve many such complex 
effects without having to model the effects in detail. We have exploited this property to 
evolve an alkyltransferase that is more potent in vivo than the natural enzyme or any reported 
mutants. This evolved mutant will be very useful in chemoprotection by gene therapy. An 
improvement over wild-type alkyltransferase is very useful to the clinician by allowing dose 
escalation of alkylating agents without the corresponding toxicity to the patient. 
Once-promising alkylating agents which are not used because of severe myelotoxicity might now 
become clinically acceptable. Even a slight improvement in alkyltransferase in vivo is useful, 
given that positive selection allows a relatively small number of resistant cells to repopulate 
the bone marrow. This alkyltransferase is further modified to incorporate additional features 
such as O6-benzylguanine resistance. The alkyltransferase can also be subjected to additional 
DNA shuffling and selected for additional improved activity in mammalian cells, such as 
improved nuclear localization, or better interaction with the eukaryotic chromatin structure. 

Example 4: Whole Genome Shuffling of Virus by In Vivo Recombination Using 
Adenovirus-Phagemids 

This example demonstrates the construction of a novel adenovirus-phagemid 
using the recursive recombination methods of the invention which is capable of packaging 
DNA inserts over 10 kilobases in size. Incorporation of a phage f1 origin using the methods 
of the invention also generates a novel in vivo shuffling format capable of evolving whole 
genomes of viruses, such as the 36 kb family of human adenoviruses. 

The widely used human adenovirus type 5 (Ad5) has a genome size of 36 kb. 
It is difficult to shuffle this large genome in vitro without creating an excessive number of 
changes which may cause a high percentage of nonviable recombinant variants. To minimize 
this problem and achieve whole genome shuffling of Ad5, an adenovirus-phagemid was 
constructed using the methods of the invention. 

As outlined in Figure 6, the 36 kb Ad5 genome was divided into two 
overlapping parts by restriction digestion. Each of the two halves was subcloned into 
pBR322, the resulting two plasmids designated pAd-R and pAd-L. Specifically, an EcoR I 
ready-made adaptor was first ligated to each end of the linear 36 kb genomic DNA. This 
ligation product was then digested with BamH I to generate the right half of the Ad5 genome 
(nucleotide 21,562 to 35,935), and with EcoR I to generate the left half of the genome 
(nucleotide 1 to 27,331). The right half 14.3 kb BamH I/EcoR I fragment was then ligated 
with BamH I/EcoR I digested pBR322 to create pAd-R, and the left half 27.3 kb EcoR I 
fragment was ligated with EcoR I digested pBR322 to create pAd-L. For gene transfer and 
safety reasons, the Ad5 E1 region was subsequently deleted from pAd-L by first creating 
an Afl II restriction site at nucleotide 455 using site-directed mutagenesis (changing a G residue 
at position 457 to a T residue, and a C residue at position 459 to an A residue). Afl II 
partial digestion was then performed, since there are other Afl II sites in the plasmid. The 
24.3 kb Afl II fragment was gel purified and filled in with DNA polymerase I to create blunt 
ends. It was then ligated with a Swa I linker to simultaneously delete the E1 region between 
nucleotides 458 and 3533, and insert a unique Swa I site for insertion of foreign genes. 
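
The fragment sizes and the extent of overlap between the two genome halves can be checked from the nucleotide coordinates given above (read here as 21,562 to 35,935 for the right half and 1 to 27,331 for the left half). The sketch below treats the coordinates as inclusive, which is an assumption; the helper names are illustrative only.

```python
RIGHT_HALF = (21_562, 35_935)   # BamH I / EcoR I fragment, per the text
LEFT_HALF = (1, 27_331)         # EcoR I fragment, per the text
E1_DELETION = (458, 3_533)      # region removed from the left-half plasmid


def length(span):
    """Length in nucleotides of an inclusive (start, end) span."""
    return span[1] - span[0] + 1


def overlap(a, b):
    """Number of nucleotides shared by two inclusive spans."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return max(0, end - start + 1)


print(length(RIGHT_HALF))             # right-half fragment, in nucleotides
print(length(LEFT_HALF))              # left-half fragment, in nucleotides
print(overlap(LEFT_HALF, RIGHT_HALF)) # shared region between the halves
print(E1_DELETION[1] - E1_DELETION[0])  # approximate size of the E1 deletion
```

On these coordinates the two halves share about 5.8 kb, which is the region available for homologous recombination between them, and the E1 deletion removes roughly 3.1 kb from the left half.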

To construct phagemids, the ssDNA phage f1 replication origin was obtained by 
PCR from pBluescript II KS(-) phagemid (Stratagene, San Diego, CA) and ligated into the 
Cla I site of the Ad plasmids (pAd-R and pAd-L-1) by recombinant DNA techniques, as 
illustrated in Figure 6. The resulting Ad-phagemids were then introduced into a mutator 
strain mutD5 (see Degnen (1974) J. Bacteriol. 117:477-487) to obtain mutations, thus 
increasing diversity. The spontaneous mutation rate of mutD5 strains is approximately 1.8 x 
10^[...]/base pair/cell/generation (see Fijalkowska (1996) Proc. Natl. Acad. Sci. USA 93:2856- 
2861), which is about 100 fold lower than that of in vitro shuffling (see Stemmer (1994) Proc. 
Natl. Acad. Sci. USA 91:10747-10751, supra). 

To prepare phagemid phage, these mutated Ad-phagemids were purified from 
the mutD5 cells and then introduced into an F+ recA1 strain (XL-1 Blue, Stratagene, San 
Diego, CA), and the resulting transformants were infected with a helper M13 phage 
(VCSM13, Stratagene, San Diego, CA) with a multiplicity of infection (MOI) of 10. The 
recA1 mutation, which abolishes the recombinase activity of RecA (see Clark (1965) Proc. 
Natl. Acad. Sci. USA 53:451-459), is essential for the stability of the 29 kb pAd-L-f1 during 
helper phage infection. Stable, high titer (>10^[...] transducing units per ml) stocks of 
Ad-phagemid phage were obtained. These ssDNA phages carrying the Ad genome were then 
used to infect a mutS201::Tn5 strain (see Siegel (1982) Mutat. Res. 93:25-33) at high 
multiplicity to promote recombination in vivo. Homologous recombination is particularly 
efficient between single-stranded forms of intracellular DNA. After replication, the 
phagemids within the cell behave as regular plasmids and undergo additional 
plasmid-plasmid recombination during subsequent cell propagation. The shuffled Ad-phagemids were 
finally recovered and purified from the cells, and used to transfect HeLa cells to generate high 
titer libraries. 

Phagemid vectors have been widely used for peptide display, cDNA cloning 
and site-directed mutagenesis (see Mead (1988) Biotechnol 10:85-102 for review). 
However, phagemid vectors have not been used with large sizes (inserts) of DNA. 
Conventional phagemid systems have not been used for cloning DNA fragments larger than 
10 kilobases or to generate large-sized (>10 kb) ssDNA. The invention's Ad-phagemid has 
been demonstrated to accept inserts as large as 15 and 24 kilobases and to effectively generate 
ssDNA of that size. In a further embodiment, larger DNA inserts, as large as 50 to 100 kb, are 
inserted into the Ad-phagemid of the invention, with generation of full length ssDNA 
corresponding to those large inserts. Generation of such large ssDNA fragments provides a 
means to evolve, i.e., modify by the recursive recombination methods of the invention, entire 
viral genomes. Thus, this invention provides for the first time a unique phagemid system 
capable of cloning large DNA inserts (>10 kb) and generating ssDNA in vitro and in vivo 
corresponding to those large inserts. 

The foregoing description of the preferred embodiments of the present
invention has been presented for purposes of illustration and description. It is not
intended to be exhaustive or to limit the invention to the precise form disclosed, and many
modifications and variations are possible in light of the above teaching. Such modifications
and variations which may be apparent to a person skilled in the art are intended to be within
the scope of this invention. All patent documents and publications cited above are
incorporated by reference in their entirety for all purposes to the same extent as if each item
were so individually denoted.
1. A method of evolving a nucleic acid

(1) recombining at least first and second forms of the segment differing from each
other in at least two nucleotides, to produce a library of recombinant segments;

(2) screening at least one recombinant segment from the library for a property useful
in gene therapy;

(3) recombining at least one recombinant segment with a further form of the segment,
the same or different from the first and second forms, to produce a further library of
recombinant segments;

(4) screening at least one further recombinant segment from the further library for
improvement in the property useful for gene therapy;

(5) repeating (3) and (4), as necessary, until the further recombinant segment confers
a desired level of the property useful for gene therapy.
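The recursion in steps (1) to (5) can be pictured with a toy simulation. The sketch below is only an illustration under stated assumptions, not the laboratory method: segments are plain strings, `recombine` performs position-by-position crossover between randomly chosen pairs, and `score` is a hypothetical stand-in for the screened property (here, the number of positions matching an arbitrary target). The names `recombine`, `score`, `evolve` and all parameter values are invented for the example.

```python
import random

def recombine(forms, library_size, rng):
    """Build a library by uniform crossover between random pairs of parent forms."""
    library = []
    for _ in range(library_size):
        a, b = rng.sample(forms, 2)
        # At each position, inherit the nucleotide from one parent at random.
        library.append("".join(rng.choice(pair) for pair in zip(a, b)))
    return library

def score(segment, target):
    """Hypothetical screen: count positions that match an arbitrary target."""
    return sum(x == y for x, y in zip(segment, target))

def evolve(forms, target, desired, library_size=200, keep=10, max_rounds=20, seed=0):
    """Repeat recombination and screening until the desired score is reached."""
    rng = random.Random(seed)
    forms = list(forms)
    best = max(forms, key=lambda s: score(s, target))
    for round_no in range(1, max_rounds + 1):
        library = recombine(forms, library_size, rng)          # steps (1) and (3)
        ranked = sorted(library + forms,                        # steps (2) and (4)
                        key=lambda s: score(s, target), reverse=True)
        forms = ranked[:keep]          # the best segments seed the next round
        best = forms[0]
        if score(best, target) >= desired:                      # step (5)
            return best, round_no
    return best, max_rounds

if __name__ == "__main__":
    parents = ["ACGTAAAAAA", "TTTTACGTAC"]   # two forms differing at several positions
    print(evolve(parents, "ACGTACGTAC", desired=10))
```

Because the parents are carried into each ranking, the best score never decreases between rounds; how quickly the desired level is reached depends on the random seed and library size.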

2. The method of claim 1 wherein at least one recombining step occurs in vivo.

3. The method of claim 1 wherein at least one recombining step occurs in vitro.

4. The method of claim 1, wherein the nucleic acid segment is a viral nucleic acid
segment.

5. The method of claim 4, wherein the viral nucleic acid segment is a viral vector.

6. The method of claim 5, wherein the viral vector is selected from the group
consisting of retroviral, adenoviral, adenoassociated and herpes viral vectors.

7. The method of claim 6, wherein the property is improved titer and
recombinant segments are screened as components of viruses, by propagation of the viruses on
cells for multiple generations and isolation of progeny viruses, the progeny viruses being
enriched for viruses having recombinant segments conferring the property of improved titer.
8. The method of claim 6, wherein the property is improved viral infectivity.

9. The method of claim 8, wherein the recombinant segments are screened as
components of viruses by determining the percentage of a population of cells infected by a
virus.

10. The method of claim 1, wherein the property is improved expression of a gene
within the nucleic acid segment, and recombinant segments are screened by detecting
expression of the recombinant segments within cells.

11. The method of claim 1, wherein the property is improved or altered drug resistance
and the recombinant segments are screened by exposing the cells to the drug and selecting
surviving cells, the surviving cells being enriched for recombinant segments having the
property of improved or altered drug resistance.

12. The method of claim 11, wherein the nucleic acid segment encodes a multidrug
resistance gene.

13. The method of claim 11, wherein the nucleic acid segment encodes an MGMT
gene.



14. The method of claim 11, wherein the cells are stem cells and the drug is a
chemotherapeutic drug.

15. The method of claim 11, further comprising increasing the concentration of the
drug between successive rounds of screening.
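Raising the drug concentration between rounds amounts to a selection schedule of increasing stringency. The sketch below is illustrative only: the `resistance` values, the starting concentration and the multiplicative `step` are hypothetical numbers in arbitrary units, and survival is modeled as a simple threshold rather than a real dose-response curve.

```python
def escalating_selection(resistance, start, step, rounds):
    """Return (concentration, survivors) for each round of threshold selection.

    resistance: dict mapping a variant name to the highest drug
                concentration it tolerates (hypothetical units).
    """
    history = []
    survivors = dict(resistance)
    concentration = start
    for _ in range(rounds):
        # Only variants tolerating the current concentration survive the round.
        survivors = {name: r for name, r in survivors.items() if r >= concentration}
        history.append((concentration, sorted(survivors)))
        if not survivors:
            break
        concentration *= step   # raise the drug concentration before the next round
    return history

if __name__ == "__main__":
    variants = {"v1": 1.0, "v2": 3.0, "v3": 10.0}
    for conc, alive in escalating_selection(variants, start=1.0, step=2.0, rounds=4):
        print(conc, alive)
```

With these assumed values the surviving pool narrows from three variants to the single most resistant one as the concentration doubles each round.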

16. The method of claim 5, wherein the property is tissue specificity and the
recombinant segments are screened as components of viruses by contacting the viruses with a
first population of cells for which the property of infectivity by the virus is desired and a
second population of cells for which the property of infectivity by the virus is not desired and
isolating progeny virus from the first population of cells, the progeny viruses being enriched
for recombinant segments conferring the property of infectivity for the first subpopulation of
cells.

17. The method of claim 16, wherein the first population of cells are neoplastic cells
and the second population of cells are nonneoplastic cells.

18. The method of claim 17, wherein the first and second populations of cells are
present in a mammal.

19. The method of claim 18, wherein the first and second populations are in mixed
cell culture.

20. The method of claim 19, wherein the first population of cells bears a marker to
allow distinction of the first population of cells from the second population of cells.

21. The method of claim 5, wherein the nucleic acid segment encodes a viral capsid
protein.

22. The method of claim 21, wherein the viral capsid protein is a fusion protein
comprising a viral protein fused in-frame to a ligand with affinity for a receptor on the surface
of the first population of cells.

23. The method of claim 5, wherein the property is viral genome capacity, and
recombinant segments are screened as components of viruses by propagating the viruses on
cells and isolating progeny viruses containing the recombinant segments, and the method
further comprises increasing the genome length of viruses containing the recombinant
segments between successive screening steps.

24. The method of claim 1, wherein the property is episomal retention, and cells
containing the recombinant segments are screened by propagating the cells without selection
for the recombinant segments and then propagating the cells with selection for the
recombinant segments, the cells surviving selection being enriched for cells harboring
recombinant segments with the property of improved episomal retention.

WO 98/13487    PCT/US97/17300



25. The method of claim 1, wherein the property is reduced immunogenicity of the
recombinant segments or an expression product thereof and the recombinant segments are
screened by introducing the recombinant segments into a mammal and recovering surviving
recombinant segments after a period of time.

26. The method of claim 25, wherein the recombinant segments are introduced into
the mammal as components of viruses.

27. The method of claim 1, wherein the property is site-specific integration and the
recombinant segments are screened by introducing them into cells and recovering a region of
cellular DNA including the desired site of integration, the region being enriched for
recombinant segments with the property of site-specific integration.



28. The method of claim 27, wherein the nucleic acid segment is an adenoassociated
viral DNA vector, and the recombinant DNA segments are screened as components of
adenoassociated viruses, by introducing the viruses into cells and recovering the region of
cellular DNA from chromosome 19q13.3.



29. The method of claim 5, wherein the property is increased stability, and the
recombinant segments are screened as components of viruses by subjecting the viruses to
destabilizing conditions and recovering surviving viruses, these viruses being enriched for
recombinant segments conferring the property.

30. The method of claim 29, wherein the destabilizing conditions are elevated
temperature, mechanical disruption, chemical denaturation or biological degradation.

31. The method of claim 5, wherein the first and second forms of the nucleic acid
segment are from different strains of a virus.
32. The method of claim 1, wherein the second form of the nucleic acid segment
results from error-prone PCR replication of the first form of the nucleic acid segment.

33. The method of claim 1, wherein the property is capacity to confer cellular
resistance to microorganism infection.

34. The method of claim 33, wherein the nucleic acid segment encodes an antisense
transcript complementary to a nucleic acid sequence in the microorganism or an aptamer that
specifically binds to a protein of the microorganism.

35. The method of claim 33, wherein the nucleic acid segment encodes a gene
expressing a protein conferring protection from the microorganism.

36. The method of claim 35, wherein the protein is Rev M10 and the virus is HIV.

37. The method of claim 35, wherein the protein is from a first virus and the
microorganism is a second virus.

38. The method of claim 37, wherein the first virus is an adenoassociated virus and
the second virus is HIV.

39. The method of claim 5, wherein recombinant segments are screened by infecting
cells containing the recombinant segments with the microorganism and recovering
recombinant segments from surviving cells.

40. The method of claim 1, wherein the nucleic acid segment comprises a coding
sequence encoding a protein or antisense RNA, which can be expressed after integration of
the segment into genomic DNA of mammalian cells.

41. The method of claim 40, wherein the nucleic acid segment is nonviral.

42. The method of claim 41, wherein the nucleic acid segment further comprises
regulatory sequences operably linked to the coding sequences, sequences to enhance cellular
uptake and sequences to enhance integration of the nucleic acid segment into a cellular
genome.

43. The method of claim 42, wherein recombinant segments are selected by
introducing the recombinant segments in nonviral form into a mammal, recovering cells from
the mammal into which the segments are integrated and expressed to produce the protein or
antisense RNA, and recovering the recombinant segments from the cells.

44. The method of claim 43, wherein the recombinant segments are selected by
introducing the recombinant segments in nonviral form into mammalian cells in culture and
recovering cells into which the segments are integrated and expressed to produce the protein
or the antisense RNA, and recovering the recombinant segments from the cells.



45. The method of claim 1, wherein the nucleic acid segment encodes a viral protein,
the property is capacity of a cell line containing the nucleic acid segment to package viral
DNA transfected into the cell line, and recombinant segments are screened by transfecting
cells containing the recombinant segments with a viral vector and determining the yield of
progeny viruses containing the viral vector produced by different cells, cells giving a high
yield being enriched for recombinant segments with enhanced packaging capacity.

46. The method of claim 1, wherein the nucleic acid segment encodes adenoassociated
viral proteins rep and cap, and the viral vector is an adenoassociated virus.

47. The method of claim 46, wherein the cells containing the recombinant segments
are infected with a helper virus, and the method further comprises determining the yield of
progeny helper virus produced by different cells and selecting cells having a high relative
yield of viruses containing viral DNA to helper viruses.



48. The method of claim 1, wherein the nucleic acid segment encodes an
adenoassociated viral rep protein, the property is reduced toxicity of the rep protein to a cell
line encoding the nucleic acid segment, and the recombinant segments are screened by
propagating cells containing the recombinant segments for several generations and measuring
viability or propagation rate of cells containing the recombinant segments.



49. The method of claim 1, wherein the segment encodes a DNA binding protein, the
property is enhanced uptake by a recipient cell of a vector encoding the DNA
binding protein, and the recombinant and further recombinant segments are screened as
components of the vector, which further encodes a selective marker, by

(a) introducing the vector into cells, and culturing the cells whereby in an individual
cell, a recombinant segment is expressed that binds to the vector in the cell as a complex,

(b) lysing the cells under conditions in which the complexes remain intact,

(c) contacting the complexes with recipient cells under conditions favoring uptake of
the complexes,

(d) screening for recipient cells expressing the selective marker, these cells being
enriched for recombinant segments encoding DNA binding proteins having acquired the
property.

50. A method of evolving a cell to package a virus, comprising:

(1) transforming a culture of cells encoding viral packaging proteins with a library of
DNA fragments at least some of which undergo recombination with cognate segments in the
genome of the cells to produce modified cells;

(2) transfecting the modified cells with viral DNA, and determining the yield of
progeny viruses produced by different cells,

(3) transforming the modified cells giving the highest yields of virus with a further
library of DNA fragments at least some of which undergo homologous recombination with
cognate segments in the genome of the modified cells to produce further modified cells;

(4) transfecting the further modified cells with viral DNA and determining the yield
of progeny viruses produced by different cells,

(5) repeating (3) and (4) as required until the further modified cells have acquired the
desired function.
51. A phagemid-adenovirus capable of generating single stranded DNA greater than
10 kilobases comprising an adenovirus and a phagemid.

52. An O6-methylguanine-DNA methyltransferase (MGMT) enzyme having at least
one amino acid segment present in a natural human MGMT coding sequence and absent in a
natural nonhuman MGMT coding sequence, and having at least one amino acid segment
present in the natural nonhuman MGMT coding sequence and absent in the natural human
MGMT coding sequence.

53. The enzyme of claim 52, wherein the natural nonhuman MGMT coding sequence
is from mouse, rat, rabbit or hamster.

54. An isolated O6-methylguanine-DNA methyltransferase (MGMT) enzyme
comprising a protein encoded by SEQ ID NO:1.

55. An expression vector comprising the O6-methylguanine-DNA
methyltransferase (MGMT) enzyme of claim 15.

56. A host cell comprising the expression vector of claim 16.

57. A transgenic animal comprising the ex
Figure 3

Degenerate oligos for spiking mammalian diversity into human MGMT

A1.  CCTTAAGGAGGGGAAAAATG GCC GAG AYT TGT AAA ATdAAAnarArrAr Arrnr.A
A2.  AAATGGACAAGGATTGTGAA CTG AAA TAG AWK GTG TTC GACAGCCCTTTGGGGAAGCT
A3.  AAATGAAACGCACCACACTG SMC AGC CCT TTG GGG GCG ATR GAGCTGTCTGGTTGTGAGCA
A4.  TGGAGCTGTCTGGTTGTGAG CGG GGT CTG CAC nCT AT AAAarr r.r.mnnr a Annn
A5.  AGCAGGGTCTGCACGAAATA CGG TTC CTG AGC GGG AAG ACGTrTGrAnrrnATfirrnT
A6.  AGCTCCTGQGCAAGGGGACG CCT ARM WCT GAT CCC AMA GAGGTCCCAGCCCCCGCTGC
A7.  CTGCAGCTGATGCCGTGGAG GCC CCA GCC W5C CCT GAG KKG CTCGGAGGTCCGGAGCGCCT
A8.  CGGTTCTCGGAGGTCCGGAG TCC CTG GTG CAG TGC GAA ACC TGGCTGAATGCCTATTTCCA
A9.  TGCAGTGCACAGCCTGGCTG SAW GCC TAT TTC CRA GAG CCCGAGGCTATCGAAGAGfT
A10. ATGCCTATTTCCACCAGCCC KCG GCT ACC CCA GGG CTG CCCGTGCCGGCTCTTCACCA
A11. AGGCTATCCAAGAGTTCCCC n^, CCGGCTCTTCACCATCCCGT
A12. ACCATCCCGTTTTCCAGCAA fi^I TCGTTCACCAGACAGGTGTT
A13. AGGTTGTGAAATTCGGAGAA AYfi fill TCTTACCAGCAATTAGCAGC
A14. CAGTGGGAGGAGCAATGAGA AfiC AATCCTGTCCCCATCCTCAT
A15. TCATCCCGTGCCACAGAGTG ATC CGC AGC RAC GGA TCC ATT GGCAACTACTCCGGAGGACT
A16. GCAGCAGCGGAGCCGTGGGC CAC TAC TCC GGA GGA CAfi GCCGTGAAGGAATGGCTTCT
A17. GGCTTCTGGCCCATGAAGGC TYC CCG AMG AGG CAG CCA GCC TTGGGGAAGCCAGGCTTGGG
A18. GGTTGGGGAAGCCAGGCTTG T5T AAG GRC TTA GCT CTG AYT GGGGnrTfinrrr AAnnr. Ar^r
A19. GGAGCTCAGGTCTGGCAGGG WCC CGG CTCAAGGGAGCGGGAGCTAr
A20. TGGCAGGGGCCTGGCTCAAG YCA TCG TTC GRG TCC TCGGGCTCCCCGCCTGCTGG
A21. TCAAGGGAGCGGGAGCTACC ACG AGC CCC RAG CTT TCT GGCCGAAACTGAGTATGAAG

Each oligo contains 20 bp homology to human cDNA 5' and 3', and
up to 21 bp non-homology (underlined) to incorporate all known mammalian diversity.
IUPAC ambiguity codes are used.
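Since the oligos rely on IUPAC ambiguity codes, each degenerate oligo stands for a pool of concrete sequences. The following sketch, offered only as an illustration, expands a degenerate stretch into that pool using the standard IUPAC nucleotide code table; the helper names `expand` and `degeneracy` are invented here, and the example input is the legible central portion of oligo A1.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand(oligo):
    """Return every concrete sequence encoded by a degenerate oligo."""
    pools = [IUPAC[base] for base in oligo.upper() if not base.isspace()]
    return ["".join(bases) for bases in product(*pools)]

def degeneracy(oligo):
    """Count the encoded sequences without enumerating them."""
    count = 1
    for base in oligo.upper():
        if not base.isspace():
            count *= len(IUPAC[base])
    return count

if __name__ == "__main__":
    core = "GCC GAG AYT TGT AAA"   # central stretch of oligo A1; Y stands for C or T
    print(expand(core))
    print(degeneracy("SAW GCC TAT TTC CRA GAG"))   # central stretch of oligo A9
```

Spaces between codons are ignored, so sequences can be pasted in as printed in the figure. Enumeration grows multiplicatively with each ambiguous position, which is why `degeneracy` is provided separately for longer oligos.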
  1 MDKDCEMKRTTLDSPLGKLELSGCEQGLHEIKLLGKCTSA HUMAN
 41 ADAVEVPAPAAVLGGP----EPLMQCTAWLNAYFHQPEAI HUMAN
 77 EEFPVPALHHPVFQQESFTRQVLWKLLKVVKFGEVISYQQ HUMAN
117 LAALAGNPKAARAVGGAMRGNPVPILIPCHRVVCSSGAVG HUMAN
157 NYSGG-LAVKEWLLAHEGHRLGKPGLGGSSGLAGAWLKGA HUMAN
196 GATSGSPPAGRN                             HUMAN

[RAT, MOUSE, HAMSTER and RABBIT rows, aligned beneath each HUMAN row with dots at
positions matching the human residue, are not legible in this copy.]

FIG. 4.
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